Prosecution Insights
Last updated: May 29, 2026
Application No. 17/507,070

IDENTIFYING IMAGE SEGMENTATION QUALITY USING NEURAL NETWORKS

Final Rejection §103
Filed: Oct 21, 2021
Priority: Apr 10, 2019 (continuation of 16/380,759)
Examiner: YANG, WEI WEN
Art Unit: 2662
Tech Center: 2600 (Communications)
Assignee: Nvidia Corporation
OA Round: 4 (Final)
Grant Probability: 82% (Favorable)
OA Rounds: 5-6
Est. Remaining: 0m
Grant Probability with Interview: 93%

Examiner Intelligence

Grants 82% (above average)
Career Allowance Rate: 82% (548 granted / 666 resolved; +20.3% vs TC avg)
Interview Lift: +10.9% (moderate, about +11%; resolved cases with interview)
Typical timeline: 2y 5m avg prosecution; 23 currently pending
Career history: 695 total applications across all art units

Statute-Specific Performance

§101: 0.6% (-39.4% vs TC avg)
§103: 94.9% (+54.9% vs TC avg)
§102: 3.7% (-36.3% vs TC avg)
§112: 0.7% (-39.3% vs TC avg)
Based on career data from 666 resolved cases

Office Action

§103
DETAILED ACTION

Response to Arguments

The amendments and arguments filed 3/9/2026 have been entered and made of record. The Applicant's amendments and arguments filed 3/9/2026 have been considered but are moot in view of the new ground(s) of rejection because the Applicant has similarly amended at least independent claims 1, 14, 23, and 30. Furthermore, the Applicant's amendments and arguments have been considered but are not persuasive:

Re Claim 1, Applicant asserts that the cited references, and particularly Gazit as modified by Yang, do not disclose the newly added limitation “identify, by one or more neural networks, segmentation features of one or more first segmentations of one or more objects within one or more images”. However, the Examiner disagrees, because Gazit clearly discloses identifying segmentation features of one or more first segmentations (see Gazit: e.g., -- [0240] For segmenting or other image processing of bones, and skeletal muscles, the bounding region is optionally not required to be aligned with the principle axes of the body, but is aligned with the expected axis of the bone, which for some bones may be very different from the principle axes of the body. Particularly for long bones of the limbs and their associated muscles, the orientation of the bone may vary greatly between images, and the orientation of the bone in the image is optionally determined or estimated from the test image, based for example on the high density portion of the bone, before the bounding region is found. [0241] In the case of long thin lumens such as blood vessels, again the bounding region is optionally aligned with the expected axis of the blood vessel or lumen, rather than with the principle axes of the body, and in the case of blood vessels in the limbs, their orientation is optionally inferred from the orientation of corresponding bones in the limbs.
Lumens which do not rigidly hold their shape, such as the lumens of the gastrointestinal tract, are optionally imaged when they are filled with a fluid, for example a fluid containing a contrast material such as barium in the case of CT images of the gastrointestinal tract, which will make their shape and internal intensity more predictable, and will make the lumen more visible. [0242] For segmenting or other image processing of components of organs, such as lobes of the lungs or the liver, heart valves and heart chambers, and blood vessels that belong to organs, the organ is optionally segmented first, and a bounding box of the component of the organ is optionally found relative to the bounding box of the organ, or relative to a salient feature of the organ.--, in [0240], [0242]); Apparently, the above Gazit disclosures of “the orientation”, “shape”, “contrast”, “intensity”, and “a bounding box of the component of the organ is optionally found relative to the bounding box of the organ, or relative to a salient feature of the organ” align with the claimed “segmentation features of one or more first segmentations” being identified; and this is similarly and consistently disclosed in Yang: see Yang: e.g., -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image. FIG. 2 illustrates a deep image-to-image network (DI2IN) 200 for liver segmentation according to an embodiment of the present invention. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region.
It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region)….. the feature maps… a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction. --, in [0024]-[0025]; Apparently, Yang’s “feature maps” associated with the liver segmentation mask, and “preserve liver shape information”, align with the above Gazit “segmentation features of one or more first segmentations” being identified, and read on the claimed “identify…segmentation features of one or more first segmentations”. Furthermore, as Applicant admitted, Gazit as modified by Yang further discloses the newly added limitation “generate, by the one or more neural networks, one or more second segmentations of the one or more objects based, at least in part, on the segmentation features” {herein “the second segmentations” have been considered as the “regenerated segmentation(s)” in previous Office Actions}: as discussed in the previous Office Action, Gazit teaches “one or more regenerated versions of the one or more segmentations generated by one or more neural networks”; see: -- distribution of magnitudes of intensity gradient expected along a boundary of the organ, specific to the chosen set of organ intensity characteristics…..
a) finding at least an approximate bounding region of the organ in the image, wherein the region of the image used for estimating organ intensity characteristics is the bounding region b) finding a new bounding region of the organ based on the segmentation of the organ; c) finding the organ intensity characteristics based on the new bounding region;…, and on the new bounding region; and e) segmenting the organ in the image again--, in [0022]-[0027]; so that it is clearly disclosed that the result of step (e), segmenting the organ in the image again, reads on the claimed “one or more regenerated versions of the one or more segmentations generated by one or more neural networks”; and “the segmentation of the organ” in step (b), finding a new bounding region of the organ based on the segmentation of the organ, reads on the claimed “the one or more segmentations”; in other words, the claimed “the one or more segmentations” are considered as the original segmentations, and segmentations of the ground truth in the training image data, and the claimed “regenerated segmentations” are the segmentations regenerated by the generative neural networks; and as such, Yang clearly discloses “generate one or more notifications to indicate quality of one or more segmentations of one or more objects…based on a score generated by comparing the one or more segmentations to one or more regenerated versions of the one or more segmentations generated by one or more neural networks” (see Yang: e.g., Fig. 6, and, -- a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images.
In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth.--, in [0021], {apparently, above “to distinguish” is to compare one or more segmentations of one or more shape features of one or more images, because “segmentation masks” are the segmentations of one or more shape features of one or more images}, and, it is further illustrated the comparison of output of generated segmentations { “segmented liver boundary 802 generated using DI2IN-AN” to input of segmentations { the ground truth liver boundary 801} in Figs. 8-9, and, “Images 805 and 810 show the ground truth liver boundary 801 and segmented liver boundary 802 generated using DI2IN-AN”, in [0041]-[0042]: also see: -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). 
The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024] So that above Yang’s disclosed “a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images.” Determine/generate “one or more notifications” as discriminating/distinguishing whether the segmentations are of “segmentations” of the ground truth, or the segmentations of regenerated version); Furthermore, it is clearly disclosed in above Gazit’s para. [0022]-[0027], “distribution of magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics”, which are applied in the comparison and cost function as a comparison operation are representations of boundaries of the organs, or the bounding region of the organ, such as the boundaries of a liver, which the boundaries, or bounding region are the “shape feature”, because these boundaries, or bounding region define the shape of the organ (see Gazit’s [0130]), so that, Gazit’s “distribution of magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics” read on Application’s “label masks”, or “segmentation masks”, as the representation of the boundaries as the claimed limitation “one or more segmentations of one or more objects”, which is output by computer learning algorithm based on: --“because the intensity gradient expected for the boundary of a target organ will generally be higher for images of higher contrast.” …., and based on information on a distribution of intensities, in the target organ and/or its vicinity, is obtained first from each of a set of training images of that organ, including a variety of different organ intensity characteristics--, in 
[0096]-[0098]; And Gazit’s disclosures are consistent with Yang’s disclosures as discussed and applied above; and furthermore, above Gazit’s disclosures of the one or more segmentations/boundaries of the training images are considered as input of one or more segmentations/boundaries of objects, being compared with computer generated output of one or more segmentations/boundaries of objects {organs} in representations of “the intensity gradient expected for the boundary of a target organ” and “the organ intensity characteristics based on the new bounding region”; as Gazit discloses: -- the intermediate results, such as a set of organ intensity characteristics determined for an input image, or image processing parameters to be used for processing an input image, or a segmentation mask of the lungs, and the final output, such as a smoothed image, or a segmented image, or a segmentation mask of a target organ--, in [0083]-[0084]; Gazit and Yang are combinable as they are in the same field of endeavor: segmentation using neural network and boundaries identification and comparison. Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Gazit’s processor using Yang’s teachings by including generate one or more notifications to indicate quality of one or more segmentations of one or more objects…based on a score generated by comparing the one or more segmentations to one or more regenerated versions of the one or more segmentations generated by one or more neural networks {such as comparing ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images} to Gazit’s learning machine in order to perform and improve automated computer-based liver segmentation in 3D medical images (see Yang: e.g. in [0005], and [0019]-[0024], [0041]-[0042]). 
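The thresholding step quoted from Yang [0024] and the score-and-notification language at issue in claim 1 can be illustrated with a minimal sketch. The Dice overlap metric, the 0.8 cutoff, the flat-list mask representation, and all function names below are assumptions chosen for illustration only; neither the claims nor the cited passages specify them.

```python
from typing import List


def to_binary_mask(prob_map: List[float], threshold: float = 0.5) -> List[int]:
    """Label voxels whose probability exceeds the threshold as 1, else 0.

    Voxels exactly at the threshold are treated as negative here; the
    quoted passage does not say how ties are handled (assumption).
    """
    return [1 if p > threshold else 0 for p in prob_map]


def dice_score(mask_a: List[int], mask_b: List[int]) -> float:
    """Dice overlap between two binary masks (assumed comparison metric)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    # Two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def quality_notification(first: List[int], second: List[int],
                         min_score: float = 0.8) -> str:
    """Compare a first segmentation to a second one and report quality.

    min_score is a hypothetical cutoff, not a value from the record.
    """
    score = dice_score(first, second)
    status = "acceptable" if score >= min_score else "low quality"
    return f"segmentation {status} (score={score:.2f})"


if __name__ == "__main__":
    first = to_binary_mask([0.9, 0.4, 0.5, 0.7])
    second = [1, 1, 0, 1]
    print(quality_notification(first, second))
```

Any other overlap or distance measure could stand in for Dice here; the point is only that a comparison of two masks yields a scalar score from which a notification is derived.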
Therefore, claims 1-35 are still not patentably distinguishable over the prior art reference(s). Further discussions are addressed in the prior art rejection section below.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5 and 13-35 are rejected under 35 U.S.C. 103 as being unpatentable over Gazit (US 20160300351 A1, as provided in IDS) in view of Yang (US 20180260957 A1, as provided in IDS).

Re Claim 1, Gazit discloses one or more processors, comprising: circuitry to: identify, by one or more neural networks, segmentation features of one or more first segmentations of one or more objects within one or more images (see Gazit: e.g., -- [0240] For segmenting or other image processing of bones, and skeletal muscles, the bounding region is optionally not required to be aligned with the principle axes of the body, but is aligned with the expected axis of the bone, which for some bones may be very different from the principle axes of the body. Particularly for long bones of the limbs and their associated muscles, the orientation of the bone may vary greatly between images, and the orientation of the bone in the image is optionally determined or estimated from the test image, based for example on the high density portion of the bone, before the bounding region is found.
[0241] In the case of long thin lumens such as blood vessels, again the bounding region is optionally aligned with the expected axis of the blood vessel or lumen, rather than with the principle axes of the body, and in the case of blood vessels in the limbs, their orientation is optionally inferred from the orientation of corresponding bones in the limbs. Lumens which do not rigidly hold their shape, such as the lumens of the gastrointestinal tract, are optionally imaged when they are filled with a fluid, for example a fluid containing a contrast material such as barium in the case of CT images of the gastrointestinal tract, which will make their shape and internal intensity more predictable, and will make the lumen more visible. [0242] For segmenting or other image processing of components of organs, such as lobes of the lungs or the liver, heart valves and heart chambers, and blood vessels that belong to organs, the organ is optionally segmented first, and a bounding box of the component of the organ is optionally found relative to the bounding box of the organ, or relative to a salient feature of the organ.--, in [0240], [0242]; {Apparently, the above Gazit disclosures of “the orientation”, “shape”, “contrast”, “intensity”, and “a bounding box of the component of the organ is optionally found relative to the bounding box of the organ, or relative to a salient feature of the organ” align with the claimed “segmentation features of one or more first segmentations” being identified}); generate, by the one or more neural networks, one or more second segmentations of the one or more objects based, at least in part, on the segmentation features (see Gazit: e.g., -- distribution of magnitudes of intensity gradient expected along a boundary of the organ, specific to the chosen set of organ intensity characteristics…..
a) finding at least an approximate bounding region of the organ in the image, wherein the region of the image used for estimating organ intensity characteristics is the bounding region b) finding a new bounding region of the organ based on the segmentation of the organ; c) finding the organ intensity characteristics based on the new bounding region;…, and on the new bounding region; and e) segmenting the organ in the image again--, in [0022]-[0027]; {so that it is clearly disclosed that the result of step (e), segmenting the organ in the image again, reads on the claimed “one or more regenerated versions of the one or more segmentations generated by one or more neural networks”; and “the segmentation of the organ” in step (b), finding a new bounding region of the organ based on the segmentation of the organ, reads on the claimed “the one or more segmentations”; in other words, the claimed “the one or more segmentations” are considered as the original segmentations, and segmentations of the ground truth in the training image data, and the claimed “regenerated segmentations” are the segmentations regenerated by the generative neural networks}; the above Gazit disclosures of the one or more segmentations/boundaries of the training images are considered as input of one or more segmentations/boundaries of objects, being compared with computer generated output of one or more segmentations/boundaries of objects {organs} in representations of “the intensity gradient expected for the boundary of a target organ” and “the organ intensity characteristics based on the new bounding region”; as Gazit discloses: -- the intermediate results, such as a set of organ intensity characteristics determined for an input image, or image processing parameters to be used for processing an input image, or a segmentation mask of the lungs, and the final output, such as a smoothed image, or a segmented image, or a segmentation mask of a target organ--, in [0083]-[0084]; so that Gazit’s “distribution of
magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics” read on Application’s “label masks”, or “segmentation masks”, as the representation of the boundaries as the claimed limitation “one or more segmentations of one or more objects”, which is output by computer learning algorithm based on: --“because the intensity gradient expected for the boundary of a target organ will generally be higher for images of higher contrast.” …., and based on information on a distribution of intensities, in the target organ and/or its vicinity, is obtained first from each of a set of training images of that organ, including a variety of different organ intensity characteristics--, in [0096]-[0098]); Gazit however does not explicitly disclose generate one or more notifications to indicate quality of the one or more first segmentations of one or more objects within the one or more images based, at least in part, on a score generated by comparing the one or more first segmentations to the one or more second segmentations; Yang discloses generate one or more notifications to indicate quality of the one or more first segmentations of one or more objects within the one or more images based, at least in part, on a score generated by comparing the one or more first segmentations to the one or more second segmentations (see Yang: e.g., Fig. 6, and, -- a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision.
In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth.--, in [0021], {apparently, above “to distinguish” is to compare one or more segmentations of one or more shape features of one or more images, because “segmentation masks” are the segmentations of one or more shape features of one or more images}, and, it is further illustrated the comparison of output of generated segmentations { “segmented liver boundary 802 generated using DI2IN-AN” to input of segmentations { the ground truth liver boundary 801} in Figs. 8-9, and, “Images 805 and 810 show the ground truth liver boundary 801 and segmented liver boundary 802 generated using DI2IN-AN”, in [0041]-[0042]: also see: -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.)….. 
the feature maps… a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction. --, in [0024]-[0025]; and, --The alternating of the discriminator training 502 and generator training 504 is iterated for a plurality of iterations, until the discriminator is not able to easily distinguish between the ground truth label maps (ground truth liver segmentation masks) and the predictions output by the DI2IN (predicted liver segmentation masks). For example, the discriminator training 502 and the generator training 504 can be iterated until the discriminator weights and the generator weights converge or until a predetermine number of maximum iterations is reached. The algorithm 500 outputs the final updated generator weights. After the training process, the adversarial network is no longer needed during the inference stage (110 of FIG. 1). The trained generator (DI2IN) itself can be used during the inference stage (110) to provide high quality liver segmentation, with improved performance due to the adversarial training.--, in [0032]; {So that above Yang’s disclosed “a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images.” Determine/generate “one or more notifications” as discriminating/distinguishing whether the segmentations are of “segmentations” of the ground truth, or the segmentations of regenerated version; and, Gazit’s disclosures are consistent with Yang’s disclosures as discussed and applied above}); Gazit and Yang are combinable as they are in the same field of endeavor: segmentation using neural network and boundaries identification and comparison. 
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Gazit’s processor using Yang’s teachings by including generate one or more notifications to indicate quality of one or more segmentations of one or more objects within one or more images based, at least in part, on a score generated by comparing the one or more segmentations to one or more regenerated versions of the one or more segmentations generated by one or more neural networks {such as comparing ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images} to Gazit’s learning machine in order to perform and improve automated computer-based liver segmentation in 3D medical images (see Yang: e.g. in [0005], and [0019]-[0024], [0041]-[0042]). Re Claim 2, Gazit as modified by Yang further disclose wherein: the one or more images comprises a medical image; and at least one of the one or more first segmentations or one or more second segmentations represent a boundary corresponding to the one or more objects within the medical image (see Gazit: e.g., --Gaussian distributions of intensity gradient, with different mean value and standard deviation corresponding to the interior and to the boundary of the left kidney for each of four clusters of different organ intensity characteristics, made from training images according to the method of FIG. 4.--, in [0022], [0073], and, --the image is anisotropically smoothed, reducing noise in the interior and exterior of a target organ, but not blurring the sharpness of the boundaries of the organ. The parameters for doing this will in general depend on the organ intensity characteristics, especially when they are due to different results of contrast agent use or lack of use, because the intensity gradient expected for the boundary of a target organ will generally be higher for images of higher contrast. 
In another example, the contrast of the image is changed, in order to make a particular type of feature or structure of a target organ, for example a lesion, more easily visible when the image is viewed.--, and, --The characteristics of the image being processed are then learned by comparing information on the distribution of intensities in a region of the image, for example in or near the target organ, to the corresponding information for the different clusters of training images.--, in [0082]-[0083], [0096]-[0100], -- to include intensity gradients in the cost function of a region-growing algorithm for segmentation of the target organ, depending on information about the expected distribution of intensity gradients inside and at the boundary of the target organ in the image being processed.--, in [0104]-[0108], [0142]-[0143], [0170], [0174], and [0179]-[0180]; Gazit teaches: -- distribution of magnitudes of intensity gradient expected along a boundary of the organ, specific to the chosen set of organ intensity characteristics….. a) finding at least an approximate bounding region of the organ in the image, wherein the region of the image used for estimating organ intensity characteristics is the bounding region b) finding a new bounding region of the organ based on the segmentation of the organ; c) finding the organ intensity characteristics based on the new bounding region;…, and on the new bounding region; and e) segmenting the organ in the image again--, in [0022]-[0027]; it is clearly disclosed in above Gazit’s para. 
[0022]-[0027], “distribution of magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics”, which are applied in the comparison and cost function as a comparison operation are representations of boundaries of the organs, or the bounding region of the organ, such as the boundaries of a liver, which the boundaries, or bounding region are the “shape feature”, because these boundaries, or bounding region define the shape of the organ (see Gazit’s [0130]), so that, Gazit’s “distribution of magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics” read on Application’s “label masks”, or “segmentation masks”, as the representation of the boundaries being compared; as Gazit discloses: -- the intermediate results, such as a set of organ intensity characteristics determined for an input image, or image processing parameters to be used for processing an input image, or a segmentation mask of the lungs, and the final output, such as a smoothed image, or a segmented image, or a segmentation mask of a target organ--, in [0083]-[0084]; also see Yang: e.g., Fig. 6, and, -- a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. 
In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth.--, in [0021]-[0024], {apparently, above “to distinguish” is to compare one or more boundaries of one or more shape features of one or more images, because “segmentation masks” are the boundaries of one or more shape features of one or more images}). Re Claim 3, Gazit as modified by Yang further disclose wherein the one or more neural networks includes a variational autoencoder trained with ground truth boundary information (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. 
In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]). 
Re Claim 4, Gazit as modified by Yang further disclose wherein a comparison of the one or more first segmentations to the one or more second segmentations is a value used to train the one or more neural networks (see Gazit: e.g., --for each location relative to the bounding region, that the location is in the target organ. The set of probabilities for a given organ is referred to as a "probabilistic atlas" for that organ. In order to meaningfully compare bounding regions that in general have different dimensions in different images, the bounding region in each image is mapped into the probabilistic atlas, for example by linearly scaling the different dimensions of the bounding region, using scaling factors that may be different for different directions and different images.--, in [0147], and, also see: --Gaussian distributions of intensity gradient, with different mean value and standard deviation corresponding to the interior and to the boundary of the left kidney for each of four clusters of different organ intensity characteristics, made from training images according to the method of FIG. 4.--, in [0022], [0073], and, --the image is anisotropically smoothed, reducing noise in the interior and exterior of a target organ, but not blurring the sharpness of the boundaries of the organ. The parameters for doing this will in general depend on the organ intensity characteristics, especially when they are due to different results of contrast agent use or lack of use, because the intensity gradient expected for the boundary of a target organ will generally be higher for images of higher contrast. 
In another example, the contrast of the image is changed, in order to make a particular type of feature or structure of a target organ, for example a lesion, more easily visible when the image is viewed.--, and, --The characteristics of the image being processed are then learned by comparing information on the distribution of intensities in a region of the image, for example in or near the target organ, to the corresponding information for the different clusters of training images.--, in [0082]-[0083], [0096]-[0100], -- to include intensity gradients in the cost function of a region-growing algorithm for segmentation of the target organ, depending on information about the expected distribution of intensity gradients inside and at the boundary of the target organ in the image being processed.--, in [0104]-[0108], [0142]-[0143], [0170], [0174], and [0179]-[0180]; Gazit teaches: -- distribution of magnitudes of intensity gradient expected along a boundary of the organ, specific to the chosen set of organ intensity characteristics….. a) finding at least an approximate bounding region of the organ in the image, wherein the region of the image used for estimating organ intensity characteristics is the bounding region b) finding a new bounding region of the organ based on the segmentation of the organ; c) finding the organ intensity characteristics based on the new bounding region;…, and on the new bounding region; and e) segmenting the organ in the image again--, in [0022]-[0027]; also see Yang: e.g., Fig. 1, and, -- The trained deep image-to-image network is trained in an adversarial network together with a discriminative network that distinguishes between predicted liver segmentation masks generated by the deep image-to-image network from input training volumes and ground truth liver segmentation masks.--, in abstract, and, -- utilize a trained deep image-to-image network to generate a liver segmentation mask from an input medical image of a patient. 
Embodiments of the present invention train the deep image-to-image network for liver segmentation in an adversarial network, in which the deep image-to-image network is trained together with a discriminator network that attempts to distinguish between ground truth liver segmentation masks and liver segmentation masks generated by the deep image-to-image network.--, in [0005]; and, -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]). Re Claim 5, Gazit as modified by Yang further disclose wherein one or more notifications indicates that the one or more first segmentations conforms to ground truth data. (see Yang: e.g., Fig. 
1, and, -- The trained deep image-to-image network is trained in an adversarial network together with a discriminative network that distinguishes between predicted liver segmentation masks generated by the deep image-to-image network from input training volumes and ground truth liver segmentation masks.--, in abstract, and, -- utilize a trained deep image-to-image network to generate a liver segmentation mask from an input medical image of a patient. Embodiments of the present invention train the deep image-to-image network for liver segmentation in an adversarial network, in which the deep image-to-image network is trained together with a discriminator network that attempts to distinguish between ground truth liver segmentation masks and liver segmentation masks generated by the deep image-to-image network.--, in [0005]; and, -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]). 
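For illustration only, the thresholding step quoted above from Yang [0024] (a probability map converted to a binary liver segmentation mask) can be sketched as follows. This is a minimal sketch, not Yang's implementation: the function name `probability_map_to_mask` and the sample values are hypothetical, a single 2D slice stands in for the 3D volume, and treating a score exactly equal to the threshold as negative is an assumption, because the quoted passage addresses only scores greater than or less than the threshold.

```python
def probability_map_to_mask(prob_map, threshold=0.5):
    """Label each pixel 1 (in the liver region) if its probability score is
    greater than the threshold, otherwise 0 (not in the liver region).

    Assumption: a score exactly equal to the threshold is labeled 0.
    """
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


# Hypothetical 2x2 slice of a probability map (values invented for illustration).
example_slice = [[0.1, 0.7],
                 [0.5, 0.93]]
mask = probability_map_to_mask(example_slice)
```

The output keeps the shape of the input, so the same comprehension could be applied slice by slice to a volume.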
Re Claim 13, Gazit as modified by Yang further disclose wherein the processor comprises a graphical processing unit (GPU) (see Yang: e.g., -- CRF and graph cut both suffer from serious leakage in these situations. Using a NVIDIA TITAN X GPU and the Theano/Lasagne library, the run time of liver segmentation using the DI2IN-AN method is less than one second, which is significantly faster than most current approaches.--, in [0039]).
Re Claim 14, claim 14 is the method claim corresponding to claim 1. Thus, claim 14 is rejected for reasons similar to those given for claim 1. Furthermore, Gazit as modified by Yang further disclose a method, using a processor comprising one or more circuits, comprising causing one or more output boundaries of one or more objects within one or more images generated by one or more neural networks to be compared to one or more input boundaries of the one or more objects to the one or more neural networks (see Yang: e.g., Fig. 6, and, -- a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth.--, in [0021]-[0024], {apparently, above “to distinguish” is to compare one or more boundaries of one or more shape features of one or more images, because “segmentation masks” are the boundaries of one or more shape features of one or more images}). 
Re Claim 15, Gazit as modified by Yang further disclose generating the one or more first segmentations of the image, wherein the one or more first segmentations represent a processor-determined set of boundaries of objects depicted in the image (see Gazit: e.g., -- to include intensity gradients in the cost function of a region-growing algorithm for segmentation of the target organ, depending on information about the expected distribution of intensity gradients inside and at the boundary of the target organ in the image being processed.--, in [0104]-[0108], [0142]-[0143], [0174], and [0179]-[0180]; also see Yang: e.g., Fig. 1, and, -- The trained deep image-to-image network is trained in an adversarial network together with a discriminative network that distinguishes between predicted liver segmentation masks generated by the deep image-to-image network from input training volumes and ground truth liver segmentation masks.--, in abstract); inputting the one or more first segmentations to a neural network previously trained on a collection of training segmentations (see Yang: e.g., Fig. 1, and, -- The trained deep image-to-image network is trained in an adversarial network together with a discriminative network that distinguishes between predicted liver segmentation masks generated by the deep image-to-image network from input training volumes and ground truth liver segmentation masks.--, in abstract, and, -- utilize a trained deep image-to-image network to generate a liver segmentation mask from an input medical image of a patient. 
Embodiments of the present invention train the deep image-to-image network for liver segmentation in an adversarial network, in which the deep image-to-image network is trained together with a discriminator network that attempts to distinguish between ground truth liver segmentation masks and liver segmentation masks generated by the deep image-to-image network.--, in [0005]; and, -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]); generating the one or more second segmentations of the one or more images, wherein the one or more second segmentations comprises an output of the one or more neural networks (see Gazit: e.g., -- distribution of magnitudes of intensity gradient expected along a boundary of the organ, specific to the chosen set of organ intensity characteristics….. 
a) finding at least an approximate bounding region of the organ in the image, wherein the region of the image used for estimating organ intensity characteristics is the bounding region b) finding a new bounding region of the organ based on the segmentation of the organ; c) finding the organ intensity characteristics based on the new bounding region;…, and on the new bounding region; and e) segmenting the organ in the image again--, in [0022]-[0027]; {so that it is clearly disclosed that the result of step e), segmenting the organ in the image again, reads on the claimed “one or more regenerated versions of the one or more segmentations generated by one or more neural networks”; and “the segmentation of the organ” in step b), finding a new bounding region of the organ based on the segmentation of the organ, reads on the claimed “the one or more segmentations”; in other words, the claimed “the one or more segmentations” are considered the original segmentations, and the segmentations of the ground truth in the training image data, and the claimed “regenerated segmentations” are the segmentations regenerated by the generative neural networks}; Gazit’s above disclosures of the one or more segmentations/boundaries of the training images are considered the input of one or more segmentations/boundaries of objects, which is compared with the computer-generated output of one or more segmentations/boundaries of objects {organs} as represented by “the intensity gradient expected for the boundary of a target organ” and “the organ intensity characteristics based on the new bounding region”; as Gazit discloses: -- the intermediate results, such as a set of organ intensity characteristics determined for an input image, or image processing parameters to be used for processing an input image, or a segmentation mask of the lungs, and the final output, such as a smoothed image, or a segmented image, or a segmentation mask of a target organ--, in [0083]-[0084]; and, so that, Gazit’s “distribution of 
magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics” read on Application’s “label masks”, or “segmentation masks”, as the representation of the boundaries recited in the claimed limitation “one or more segmentations of one or more objects”, which is output by a computer learning algorithm based on: --“because the intensity gradient expected for the boundary of a target organ will generally be higher for images of higher contrast.” …., and based on information on a distribution of intensities, in the target organ and/or its vicinity, is obtained first from each of a set of training images of that organ, including a variety of different organ intensity characteristics--, in [0096]-[0098]); comparing the one or more first segmentations to the output of the one or more neural networks (see Yang: e.g., Figs. 8-9; and, -- utilize a trained deep image-to-image network to generate a liver segmentation mask from an input medical image of a patient. Embodiments of the present invention train the deep image-to-image network for liver segmentation in an adversarial network, in which the deep image-to-image network is trained together with a discriminator network that attempts to distinguish between ground truth liver segmentation masks and liver segmentation masks generated by the deep image-to-image network.--, in [0005]; and, --As shown in FIG. 8, image 800 is a CT image of a patient with pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver. This significantly increases the difficult for automatic liver segmentation, since in most CT volumes the lung looks dark with a low intensity. A test case shown in FIG. 8 usually corresponds with the largest error for a given method in Table 1. 
Images 805 and 810 show the ground truth liver boundary 801 and segmented liver boundary 802 generated using DI2IN-AN in different slices of the CT volume of the patient with pleural effusion. Although the DI2IN-AN segmentation has difficulty at the upper boundary, it still outperforms the other methods in this challenging test case. [0042] FIG. 9 illustrates exemplary liver segmentation results using the method of FIG. 1 in a CT volume of a patient with an enlarged liver. Another challenging case for automatic liver segmentation is a patient with an enlarged liver. As shown in FIG. 9, image 900 is a CT image of a patient with an enlarged liver. Images 905 and 910 show the ground truth liver boundary 901 and the segmented liver boundary 902 generated using DI2IN-AN segmentation.--, in [0041]-[0042]); determining the score for the one or more first segmentations, wherein the score is a function of differences between the one or more first segmentations and the one or more second segmentations (see Yang: e.g., --These operations are repeated k.sub.D times, and then the algorithm proceeds to the generator G training 504. In the generator G training, a mini-batch of training images x.about.p.sub.data are samples, a prediction y.sub.pred is generated for each training image x by the generator with the current generator weights G (x;.theta..sub.0.sup.G) and a classification score D(G(x')) is computed by the discriminator for each prediction, and updated generator weights .theta..sub.1.sup.G are learned to minimize the generator loss function l.sub.G by propagating back the stochastic gradient .gradient.l.sub.G (y.sub.gt',y.sub.pred'). These operations are repeated k.sub.G times. 
The alternating of the discriminator training 502 and generator training 504 is iterated for a plurality of iterations, until the discriminator is not able to easily distinguish between the ground truth label maps (ground truth liver segmentation masks) and the predictions output by the DI2IN (predicted liver segmentation masks)…. The trained generator (DI2IN) itself can be used during the inference stage (110) to provide high quality liver segmentation, with improved performance due to the adversarial training. --, in [0032]-[0033]). Re Claim 16, Gazit as modified by Yang further disclose wherein the neural network is a variational autoencoder that takes the one or more first segmentations as its input, wherein the variational autoencoder maps segmentation features of its input to a reduced feature space from which the first one or more segmentations can be approximately reproduced from segmentation features in the reduced feature space to generate the one or more second segmentations (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. 
Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. 
The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]). Re Claim 17, Gazit as modified by Yang further disclose training the variational autoencoder with the collection of training segmentations, wherein the collection of training segmentations are represented by label masks that are ground truth label masks of training images previously determined to represent good segmentations of the images (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. 
The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. 
The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]; and, -- The present inventors collected more than 1000 CT volumes for training. The liver of each CT volume was delineated by human experts. These CT volumes cover large variations in population contrast phases, scanning ranges, pathologies, and field of view (FOV). The inter-slice distance varies from 0.5 mm to 0.7 mm. All of the scans cover the abdominal regions, but some may extend to the head and/or feet as well. Tumors can be found in multiple cases. Other diseases are present in the CT volumes as well. 
For example, pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver, is present in some of the scans.--, in [0038]; also see Gazit: e.g., --the mask generated in 818 is optionally morphologically improved, for example filling in holes in its interior that are completely surrounded by mask voxels, or filling in holes that are completely surrounded by mask voxels within a slice at a particular orientation, for example within an axial slice, or that are completely surrounded by mask voxels within a slice in any principal orientation (axial, coronal, or sagittal), or that are completely surrounded by mask voxels within a slice in either of two principal orientations, or in any two principal orientations, or in all three principal orientations. Such holes may result from noise in the image, as well as from small blood vessels, liver bile ducts, or other small features inside the target organ, that have a different intensity than most of the voxels of the target organ but are still considered part of the target organ.--, in [0177], and, --for segmenting or other image processing of components of organs, such as lobes of the lungs or the liver, heart valves and heart chambers, and blood vessels that belong to organs, the organ is optionally segmented first, and a bounding box of the component of the organ is optionally found relative to the bounding box of the organ, or relative to a salient feature of the organ.--, in [0242]). 
Re Claim 18, Gazit as modified by Yang further disclose training a segmenter to generate the one or more segmentations of the image by applying a collection of segment datasets to the segmenter, wherein each segment dataset of the collection of segment datasets comprises a training image and a corresponding training label mask that is a ground truth label mask of the training image previously been determined to represent a good segmentation of the training image (see Yang: e.g., -- For example, pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver, is present in some of the scans. An additional 50 CT volumes were collected from clinical sites for independent testing. The livers in these scans were also annotated by human experts for the purpose of evaluation. The dataset was down-sampled into 3.0 mm resolution isotropically to speed up the processing and lower the consumption of computer memory without loss of accuracy. In the adversarial training, .lamda. was set to 0.01, and the number of overall training iterations used was 100. For training the discriminator D, k.sub.D was 10 and the mini-batch size was 8. For training the DI2IN generator G, k.sub.G was 1 and the mini-batch size was 4. For calculating the segmentation loss, w.sub.i was set as 1….. Take DI2IN for example, training with 1000+ labelled data improves the mean ASD by 0.23 mm and the max ASD by 3.84 mm. Table 1 also shows that the adversarial training structure further boosts the performance of DI2IN. The maximum ASD error is also reduced using the DI2IN-AN. FIG. 6 shows exemplary liver segmentation results generated using DI2IN-AN. --, in [0038]-[0039]). 
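For illustration only, the alternating training schedule described in the quoted Yang passages (discriminator training repeated k.sub.D times, then generator training repeated k.sub.G times, iterated for a number of overall iterations) can be sketched as follows. This is a sketch of the ordering of update steps only, with no actual weight updates: the function name `adversarial_schedule` is hypothetical, and reading "the number of overall training iterations used was 100" as 100 alternations of the two phases is an assumption.

```python
def adversarial_schedule(num_iterations=100, k_d=10, k_g=1):
    """Yield (phase, iteration) pairs in the order the updates would run.

    Defaults follow the values quoted from Yang [0038]-[0039]
    (100 overall iterations, k_D = 10, k_G = 1). Assumption: one overall
    iteration is one discriminator phase followed by one generator phase.
    """
    for iteration in range(num_iterations):
        for _ in range(k_d):
            yield ("discriminator", iteration)
        for _ in range(k_g):
            yield ("generator", iteration)


steps = list(adversarial_schedule())
d_updates = sum(1 for phase, _ in steps if phase == "discriminator")
g_updates = sum(1 for phase, _ in steps if phase == "generator")
```

Under these assumptions the discriminator would receive ten updates for every generator update.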
Re Claim 19, Gazit as modified by Yang further disclose performing, using a first trained neural network of the one or more neural networks, a first segmentation process to output a first label mask in response to obtaining a representation of the one or more images (see Gazit: e.g., -- a) obtaining information about organ intensity characteristics for each of the training images; b) grouping the training images into clusters according to the information about organ intensity characteristics; and c) for each of a plurality of the clusters, finding representative information about the organ intensity characteristics for the training images in that cluster; wherein providing the sets of organ intensity characteristics comprises providing the representative information for the training images in the cluster that each set is generated from, for the sets that are generated from the training images.--, in [0016], and, -- Learning-based methods, based on knowledge encoded during an off-line training process, often involving manual or semi-manual ground-truth segmentations of training images, for example involving the locations of organs relative to a bounding box that may include the entire abdomen, or the entire body.--, in [0163]; also see Yang: e.g., Fig. 1, and, -- The trained deep image-to-image network is trained in an adversarial network together with a discriminative network that distinguishes between predicted liver segmentation masks generated by the deep image-to-image network from input training volumes and ground truth liver segmentation masks.--, in abstract, and, -- utilize a trained deep image-to-image network to generate a liver segmentation mask from an input medical image of a patient. 
Embodiments of the present invention train the deep image-to-image network for liver segmentation in an adversarial network, in which the deep image-to-image network is trained together with a discriminator network that attempts to distinguish between ground truth liver segmentation masks and liver segmentation masks generated by the deep image-to-image network.--, in [0005]; and, -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]). Re Claim 20, Gazit as modified by Yang further disclose performing, using a second trained neural network of the one or more neural networks, a shape evaluation process using a first label mask as an input and outputs a second label mask (see Yang: e.g., -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 
3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]). Re Claim 21, Gazit as modified by Yang further disclose mapping an input of the second trained neural network to a latent representation in a feature space where features in the feature space are shape features (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. 
In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. 
During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]). Re Claim 22, Gazit as modified by Yang further disclose wherein the autoencoder is a variational autoencoder (see Gazit: e.g., -- a) obtaining information about organ intensity characteristics for each of the training images; b) grouping the training images into clusters according to the information about organ intensity characteristics; and c) for each of a plurality of the clusters, finding representative information about the organ intensity characteristics for the training images in that cluster; wherein providing the sets of organ intensity characteristics comprises providing the representative information for the training images in the cluster that each set is generated from, for the sets that are generated from the training images.--, in [0016], and, -- Learning-based methods, based on knowledge encoded during an off-line training process, often involving manual or semi-manual ground-truth segmentations of training images, for example involving the locations of organs relative to a bounding box that may include the entire abdomen, or the entire body.--, in [0163]; --the mask generated in 818 is optionally morphologically improved, for example filling in holes in its interior that are completely surrounded by mask voxels, or filling in holes that are completely surrounded by mask voxels within a slice at a particular orientation, for example within an axial slice, or that are completely surrounded by mask voxels within a slice in any principal orientation (axial, coronal, or sagittal), or that are completely surrounded by mask voxels 
within a slice in either of two principal orientations, or in any two principal orientations, or in all three principal orientations. Such holes may result from noise in the image, as well as from small blood vessels, liver bile ducts, or other small features inside the target organ, that have a different intensity than most of the voxels of the target organ but are still considered part of the target organ.--, in [0177], also see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. 
Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]). Re Claims 23 and 26-29, claims 23 and 26-29 are the method claims corresponding to claims 1-5, respectively. Thus, claims 23 and 26-29 are rejected for reasons similar to those for claims 1-5. Furthermore, Gazit as modified by Yang further disclose a computer system comprising one or more processors and memory storing executable instructions that, as a result of being performed by the one or more processors, cause the computer system to perform the steps (see Yang: e.g., Fig.
6, and, -- a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth.--, in [0021]-[0024], {apparently, above “to distinguish” is to compare one or more boundaries of one or more shape features of one or more images, because “segmentation masks” are the boundaries of one or more shape features of one or more images}). Re Claim 24, Gazit as modified by Yang further disclose input the one or more segmentation as a VAE input to the VAE, wherein the one or more segmentations represents a processor-determined set of boundaries of objects depicted in the one or more images (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. 
Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, {apparently, above “to distinguish” is to compare one or more boundaries of one or more shape features of one or more images, because “segmentation masks” are the boundaries of one or more shape features of one or more images} in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. 
The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]; and, -- The present inventors collected more than 1000 CT volumes for training. The liver of each CT volume was delineated by human experts. These CT volumes cover large variations in population contrast phases, scanning ranges, pathologies, and field of view (FOV). The inter-slice distance varies from 0.5 mm to 0.7 mm. All of the scans cover the abdominal regions, but some may extend to the head and/or feet as well. Tumors can be found in multiple cases. Other diseases are present in the CT volumes as well. For example, pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver, is present in some of the scans.--, in [0038]); compare the VAE input to a VAE output of the VAE (see Yang: e.g., Fig. 6, and, -- a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. 
In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth.--, in [0021]-[0024], {apparently, above “to distinguish” is to compare one or more boundaries of one or more shape features of one or more images, because “segmentation masks” are the boundaries of one or more shape features of one or more images}); and determine a score for the one or more segmentations, wherein the score is a function of differences between the VAE input and the VAE output (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. 
In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]; and, -- The present inventors collected more than 1000 CT volumes for training. The liver of each CT volume was delineated by human experts. These CT volumes cover large variations in population contrast phases, scanning ranges, pathologies, and field of view (FOV). 
The inter-slice distance varies from 0.5 mm to 0.7 mm. All of the scans cover the abdominal regions, but some may extend to the head and/or feet as well. Tumors can be found in multiple cases. Other diseases are present in the CT volumes as well. For example, pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver, is present in some of the scans.--, in [0038]). Re Claim 25, Gazit as modified by Yang further disclose outputting the score; determining if the score is within a predetermined range; and outputting an alarm signal as the one or more notifications if the score is within the predetermined range (see Yang: e.g., -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to covert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]). Re Claims 30-34, claims 30-34 are the medium claims corresponding to claims 1-5, respectively. Thus, claims 30-34 are rejected for reasons similar to those for claims 1-5.
Furthermore, Gazit as modified by Yang further disclose a non-transitory computer system comprising one or more processors and memory storing executable instructions that, as a result of being performed by the one or more processors, cause the computer system to perform the steps (see Gazit: e.g., --any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.--, in [0057]-[0059]). Re Claim 35, Gazit as modified by Yang further disclose wherein the one or more regenerated versions of the one or more segmentations are generated by the one or more neural networks based, at least in part, on the one or more segmentations input (see Gazit: e.g., -- distribution of magnitudes of intensity gradient expected along a boundary of the organ, specific to the chosen set of organ intensity characteristics…..
a) finding at least an approximate bounding region of the organ in the image, wherein the region of the image used for estimating organ intensity characteristics is the bounding region b) finding a new bounding region of the organ based on the segmentation of the organ; c) finding the organ intensity characteristics based on the new bounding region;…, and on the new bounding region; and e) segmenting the organ in the image again--, in [0022]-[0027]; it is clearly disclosed in above Gazit’s para. [0022]-[0027], “distribution of magnitudes of intensity gradient expected along a boundary of the organ”, and “intensity characteristics” and “estimated organ intensity characteristics”, which are applied in the comparison and cost function as a comparison operation are representations of boundaries of the organs, or the bounding region of the organ, such as the boundaries of a liver, which the boundaries, or bounding region are the “shape feature”, because these boundaries, or bounding region define the shape of the organ, see Gazit’s [0130], --“because the intensity gradient expected for the boundary of a target organ will generally be higher for images of higher contrast.” …., and based on information on a distribution of intensities, in the target organ and/or its vicinity, is obtained first from each of a set of training images of that organ, including a variety of different organ intensity characteristics--, in [0096]-[0098]; {above mentioned learning algorithm based on information on a distribution of intensities, in the target organ and/or its vicinity, is obtained first from each of a set of training images of that organ, including a variety of different organ intensity characteristics is carried out by neural networks, see Gazit’s [0165]; also see Yang: e.g., --“segmented liver boundary 802 generated using DI2IN-AN, in Fig. 
8”, aligns to claimed “the one or more output boundaries are generated by the one or more neural networks”, herein “DI2IN-AN” reads on neural networks, see in Yang’s: -- the liver is segmented in the 3D medical image using the trained DI2IN. As described above, the DI2IN for liver segmentation is trained as the generator network of an adversarial network including the generator (DI2IN) and a discriminator network. In order to segment the liver in the received 3D medical image, the received 3D medical image is input to the trained DI2IN and the trained DI2IN generates a liver segmentation mask from the input 3D medical image…. the MICCAI-Sliver07 dataset only contains 20 CT volumes for training and 10 CT volumes for testing. All of the data are contrast enhanced. Such a small dataset is no suitable to show the power of CNN, as neural networks trained with more labelled data can usually achieve better performance. The present inventors collected more than 1000 CT volumes for training. The liver of each CT volume was delineated by human experts {referred as “ground truth liver boundary 801” as input, in Fig. 8}--, in [0035]-[0038]).
Claims 6-12 are rejected under 35 U.S.C. 103 as being unpatentable over Gazit as modified by Yang, and further in view of Hsu (US 6404920 B1, as provided in IDS).
Re Claim 6, Gazit as modified by Yang, however, do not disclose wherein the one or more segmentations include a first label mask representing boundaries of one or more objects in one or more images determined from a first segmentation process and the one or more regenerated versions of the one or more segmentations include a second label mask representing an output of a shape evaluation process into which the first label mask was an input, and wherein the indication logic is configured to compare the first label mask and the second label mask to determine the quality of the one or more segmentations. Hsu teaches the one or more segmentations include a first label mask representing boundaries of one or more objects in one or more images determined from a first segmentation process and the one or more regenerated versions of the one or more segmentations include a second label mask representing an output of a shape evaluation process into which the first label mask was an input, and wherein the indication logic is configured to compare the first label mask and the second label mask to determine the quality of the one or more segmentations (see Hsu: e.g., -- An edge is generally defined as the difference between adjacent pixels. Edge-based image segmentation is performed by generating an edge map and linking the edge pixels to form a closed contour.--, in lines 40-43, col. 1; and, -- Object identification is a subsequent action after segmentation to label an object using commonly-accepted object names, such as a river, a forest or an M-60 tank…. Shape analysis is a subset of model-based approaches that requires extraction of object features from the boundary contour or a set of depth contours. --, in lines 40-56, col. 2; -- The principal use of boundary contour is to perform object recognition using shape information. The variables used to describe a given contour are called shape descriptors.
Conventionally, researchers use Fourier descriptors and moments as shape descriptors. Another approach is to use a neural network to perform classification analysis by using binary silhouette-based images as input without having to extract feature attributes. While this boundary contour-based method is extremely effective in recognizing airplane types--, in lines 22-31, col. 20; and, -- Equation (2), above, is equivalent to using a mask created by the B8 to perform segmentation using the B0 data set. Thus, segmentation of B0 is entirely determined by B8. The generation of the binary is based on a simple thresholding principle, shown as Equation (1), above. Human intervention is minimal. (171) In general, the image B0_Seg has clutter attached at the bottom and both sides. Additional processing is thus required to create a clutter-free boundary contour. For this, an intelligent segmenter is used which is capable of merging neighboring pixels and subsequently performing region absorption based on size, shape and other criteria…. the next step is to merge the elongated region with the general background using a shape criterion…. a sample general penalty function-based segmentation method that is used to generate high-quality boundary contour--, in line 19, col. 21 through line 42, col. 22), Gazit (as modified by Yang) and Hsu are combinable as they are in the same field of endeavor: segmentation using neural network and boundaries identification and comparison. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Gazit (as modified by Yang)’s processor using Hsu’s teachings by including wherein the one or more segmentations include a first label mask representing boundaries of one or more objects in one or more images determined from a first segmentation process and the one or more regenerated versions of the one or more segmentations include a second label mask representing an output of a shape evaluation process into which the first label mask was an input, and wherein the indication logic is configured to compare the first label mask and the second label mask to determine the quality of the one or more segmentations to Gazit (as modified by Yang)’s indicates for which relatively narrow ranges of intensity a voxel in the bounding region is relatively most likely to be in the target organ, rather than outside the target organ in order to generate high-quality segmentation by generating a high quality boundary contour (see Hsu: e.g., in lines 40-43, col. 1; and, in line 19, col. 21 through line 42, col. 22).
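For illustration only, the comparison of a first label mask with a second label mask recited in claim 6 might be sketched as follows. This sketch assumes a Dice overlap as the comparison and a hypothetical `quality_threshold` of 0.9; neither the claim language nor the cited passages of Gazit, Yang, or Hsu specify this metric or threshold, and the function names `dice_score` and `is_low_quality` are illustrative only.

```python
from typing import Sequence

# A 2D binary label mask: nonzero = object, 0 = background (illustrative type).
Mask = Sequence[Sequence[int]]


def dice_score(first_mask: Mask, second_mask: Mask) -> float:
    """Dice overlap between two binary masks of identical shape (assumed metric)."""
    if len(first_mask) != len(second_mask):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    total = 0
    for row_a, row_b in zip(first_mask, second_mask):
        if len(row_a) != len(row_b):
            raise ValueError("mask rows must have the same length")
        for a, b in zip(row_a, row_b):
            a_on, b_on = int(bool(a)), int(bool(b))
            intersection += a_on & b_on
            total += a_on + b_on
    # Two empty masks are treated as agreeing perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def is_low_quality(first_mask: Mask, second_mask: Mask,
                   quality_threshold: float = 0.9) -> bool:
    """Flag a segmentation whose two masks overlap less than the assumed threshold."""
    return dice_score(first_mask, second_mask) < quality_threshold
```

Under these assumptions, a first label mask that closely matches the second label mask yields a score near 1.0, and a lower score would be one possible indication of poor segmentation quality.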
Re Claim 7, Gazit as modified by Yang and Hsu further disclose wherein the one or more neural networks include a first trained neural network that performs the first segmentation process to output the first label mask in response to obtaining a representation of the image (see Gazit: e.g., -- a) obtaining information about organ intensity characteristics for each of the training images; b) grouping the training images into clusters according to the information about organ intensity characteristics; and c) for each of a plurality of the clusters, finding representative information about the organ intensity characteristics for the training images in that cluster; wherein providing the sets of organ intensity characteristics comprises providing the representative information for the training images in the cluster that each set is generated from, for the sets that are generated from the training images.--, in [0016], and, -- Learning-based methods, based on knowledge encoded during an off-line training process, often involving manual or semi-manual ground-truth segmentations of training images, for example involving the locations of organs relative to a bounding box that may include the entire abdomen, or the entire body.--, in [0163]; also see Yang: e.g., Fig. 1, and, -- The trained deep image-to-image network is trained in an adversarial network together with a discriminative network that distinguishes between predicted liver segmentation masks generated by the deep image-to-image network from input training volumes and ground truth liver segmentation masks.--, in abstract, and, -- utilize a trained deep image-to-image network to generate a liver segmentation mask from an input medical image of a patient. 
Embodiments of the present invention train the deep image-to-image network for liver segmentation in an adversarial network, in which the deep image-to-image network is trained together with a discriminator network that attempts to distinguish between ground truth liver segmentation masks and liver segmentation masks generated by the deep image-to-image network.--, in [0005]; and, -- a deep image-to-image network (DI2IN) for liver segmentation is pre-trained based on the training samples in a first training phase. The DI2IN is a multi-layer convolutional neural network (CNN) trained to perform liver segmentation in an input 3D medical image…. The segmentation task performed by the DI2IN 200 is defined as the voxel-wise binary classification of an input 3D medical image. As shown in FIG. 2, the DI2IN 200 takes an entire 3D CT volume 202 as input, and outputs a probability map that indicates the probability/likelihood of voxels belonging to the liver region. It is straightforward to convert such a probability map to a binary liver segmentation mask by labeling all voxels with a probability score greater than a threshold (e.g., 0.5) as positive (in the liver region) and all voxels with a probability score less than the threshold as negative (not in the liver region). The prediction 204 output by the DI2IN 200 for a given input 3D CT volume 202 can be output as a probability map or a binary liver segmentation mask.--, in [0024]).

Re Claim 8, Gazit as modified by Yang and Hsu further disclose wherein the one or more neural networks include a second trained neural network that performs the shape evaluation process using the first label mask as its input and outputs the second label mask (see Yang: e.g., -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG.
3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]).

Re Claim 9, Gazit as modified by Yang and Hsu further disclose wherein the second trained neural network is an autoencoder with an internal layer that maps its input to a latent representation in a feature space where features in the feature space are shape features (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision.
In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. 
During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]).

Re Claim 10, Gazit as modified by Yang and Hsu further disclose wherein the autoencoder is a variational autoencoder (see Gazit: e.g., -- a) obtaining information about organ intensity characteristics for each of the training images; b) grouping the training images into clusters according to the information about organ intensity characteristics; and c) for each of a plurality of the clusters, finding representative information about the organ intensity characteristics for the training images in that cluster; wherein providing the sets of organ intensity characteristics comprises providing the representative information for the training images in the cluster that each set is generated from, for the sets that are generated from the training images.--, in [0016], and, -- Learning-based methods, based on knowledge encoded during an off-line training process, often involving manual or semi-manual ground-truth segmentations of training images, for example involving the locations of organs relative to a bounding box that may include the entire abdomen, or the entire body.--, in [0163]; --the mask generated in 818 is optionally morphologically improved, for example filling in holes in its interior that are completely surrounded by mask voxels, or filling in holes that are completely surrounded by mask voxels within a slice at a particular orientation, for example within an axial slice, or that are completely surrounded by mask voxels within a slice in any principal orientation (axial, coronal, or sagittal), or that are completely surrounded by mask
voxels within a slice in either of two principal orientations, or in any two principal orientations, or in all three principal orientations. Such holes may result from noise in the image, as well as from small blood vessels, liver bile ducts, or other small features inside the target organ, that have a different intensity than most of the voxels of the target organ but are still considered part of the target organ.--, in [0177], also see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. 
Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]). 
Re Claim 11, Gazit as modified by Yang and Hsu further disclose comprising logic for training the second trained neural network using a training subcollection of segment datasets, wherein a segment dataset of the training subcollection comprises a training image and a corresponding training label mask (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers. 
Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]; and, -- The present inventors collected more than 1000 CT volumes for training. The liver of each CT volume was delineated by human experts. These CT volumes cover large variations in population contrast phases, scanning ranges, pathologies, and field of view (FOV). The inter-slice distance varies from 0.5 mm to 0.7 mm. All of the scans cover the abdominal regions, but some may extend to the head and/or feet as well. Tumors can be found in multiple cases. 
Other diseases are present in the CT volumes as well. For example, pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver, is present in some of the scans.--, in [0038]).

Re Claim 12, Gazit as modified by Yang and Hsu further disclose wherein the logic for training the second trained neural network further uses a validation subcollection of segment datasets (see Yang: e.g., --According to an embodiment of the present invention, a deep image-to-image network (DI2IN) that produces liver segmentation masks from input 3D medical images acts as the generator and is trained together with a discriminator that attempts to distinguish between ground truth liver segmentation mask training samples and liver segmentation masks generated by the DI2IN from input medical images. In an advantageous embodiment, the DI2IN employs a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. In training, the DI2IN-AN attempts to optimize a multi-class cross-entropy loss together with an adversarial term that aims to distinguish between the output of the DI2IN and the ground truth. Advantageously, the discriminator pushes the generator's output towards the distribution of ground truth, and thus enhances the generator's performance by refining its output during training. Since the discriminator can be implemented using a CNN which takes the joint configuration of many input variables, the discriminator embeds higher-order potentials in the adversarial network.--, in [0021]; and, --In the encoder part (BLK 1-BLK 4) of the DI2IN 200 only convolutional layers are used in all of the blocks. In order to increase the receptive field of neurons and lower the GPU memory consumption, the stride is set as 2 at some layers in the encoder and the size of the feature maps is reduced at each of those layers.
Moreover, a larger receptive field covers more contextual information and helps to preserve liver shape information in the prediction.--, in [0025]; and, -- the DI2IN is trained together with a discriminator network in adversarial network in order to boost the performance of the DI2IN. FIG. 3 illustrates an adversarial network according to an embodiment of the present invention. As shown in FIG. 3, the adversarial network includes a generator 300 and a discriminator 310. According to an advantageous embodiment, the generator 300 is the DI2IN for liver segmentation. For example, the generator 300 can be the DI2IN 200 having the network structure shown in FIG. 2. The discriminator 310 is a deep neural network that attempts to distinguish between ground truth liver segmentation masks and predicted liver segmentation masks generated by the generator 300 (DI2IN) from training images. The adversarial network is utilized in training to capture high-order appearance information, which distinguishes between the ground truth and output from the DI2IN. During training, the generator 300 inputs training CT volumes 302 and generates predictions 304 (i.e., predicted liver segmentation masks or probability maps) from the input training CT volumes 302. The discriminator 310 inputs ground truth liver segmentation masks 306 and the predictions 304 generated by the generator 300, and classifies these images as real/ground truth (positive) or fake/prediction (negative).--, in [0028]; and, -- The present inventors collected more than 1000 CT volumes for training. The liver of each CT volume was delineated by human experts. These CT volumes cover large variations in population contrast phases, scanning ranges, pathologies, and field of view (FOV). The inter-slice distance varies from 0.5 mm to 0.7 mm. All of the scans cover the abdominal regions, but some may extend to the head and/or feet as well. Tumors can be found in multiple cases. 
Other diseases are present in the CT volumes as well. For example, pleural effusion, which brightens the lung region and changes the pattern of the upper boundary of the liver, is present in some of the scans.--, in [0038]).

Conclusion

Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEI WEN YANG whose telephone number is (571)270-5670. The examiner can normally be reached on 8:00 - 5:00 pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini can be reached on 571-272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WEI WEN YANG/
Primary Examiner, Art Unit 2662

Prosecution Timeline

Apr 25, 2024: Response Filed
Jul 24, 2024: Final Rejection mailed — §103
Jan 30, 2025: Response after Non-Final Action
Aug 22, 2025: Request for Continued Examination
Aug 26, 2025: Response after Non-Final Action
Oct 08, 2025: Non-Final Rejection mailed — §103
Mar 09, 2026: Response Filed
Apr 15, 2026: Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12639934: ROBUSTNESS MEASUREMENT DEVICE, ROBUSTNESS MEASUREMENT METHOD, AND STORAGE MEDIUM (2y 9m to grant; granted May 26, 2026)
Patent 12633135: METHOD AND SYSTEM FOR TOPOLOGY DETECTION (3y 9m to grant; granted May 19, 2026)
Patent 12626340: SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING SELF-SUPERVISED VISUAL REPRESENTATION LEARNING USING ORDER AND APPEARANCE RECOVERY ON A VISION TRANSFORMER (3y 1m to grant; granted May 12, 2026)
Patent 12622618: WRIST-WORN IMPAIRMENT DETECTION AND METHODS FOR USING SUCH (2y 3m to grant; granted May 12, 2026)
Patent 12620484: MACHINE-LEARNING TECHNIQUES FOR OXYGEN THERAPY PREDICTION USING MEDICAL IMAGING DATA AND CLINICAL METADATA (5y 8m to grant; granted May 05, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 82%
With Interview: 93% (+10.9%)
Median Time to Grant: 2y 5m (~0m remaining)
PTA Risk: High
Based on 666 resolved cases by this examiner. Grant probability derived from career allowance rate.
