DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Information Disclosure Statement
The information disclosure statement(s) (IDS) submitted on 08/27/2024 has been considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3-8, 10-11, and 13-21 are rejected under 35 U.S.C. 103 as being unpatentable over GAO et al. (US 20200357096 A1), hereinafter referenced as GAO, in view of HIASA et al. (US 20220076071 A1), hereinafter referenced as HIASA, and further in view of ICHINO et al. (US 20230360440 A1), hereinafter referenced as ICHINO.
Regarding claim 1, GAO explicitly teaches a GAN-based super-resolution image processing method (Fig. 6. Paragraph [0050]-GAO discloses a deep residual network is built under the generative adversarial network (GAN) framework to estimate the primitive super-resolution image I^SR (the latent structure features) from the time-series of low-resolution fluorescent images 114), comprising:
obtaining a positive sample image, a negative sample image, and a reference sample image, wherein the positive sample image is a ground-truth super-resolution image corresponding to an input sample image (Fig. 1. Paragraph [0031]-GAO discloses the Simulation module 110 is shown in FIG. 1 as receiving as input a high-resolution image 112 and generating as output plural, simulated, noisy, low-resolution images 114. The input high-resolution image 112 may be obtained from an existing collection of images, may be generated with a fluorescence microscope, or may be obtained in any other possible way. The input high-resolution image 112 needs to show various structures (called herein fluorophores) with enough clarity so that the Deep Learning module can be trained), the negative sample image is an image obtained by performing fusion and noise addition on the input sample image and the positive sample image (Fig. 5. Paragraph [0048]-GAO discloses a flowchart of a method for generating the low-resolution images 114. The method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, based on the first image 112, a step 506 of adding DC background to the time-series plurality of second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating a time-series of low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein the noise added may be Gaussian noise). In paragraph [0032]-GAO discloses the Simulation module 110 is designed to generate ground-truth high-resolution images that will be used by the Deep Learning module 120 for training (i.e., the Simulation module would generate ground-truth high-resolution images). Please also read paragraph [0036 and 0046]), and the reference sample image is an image output after the input sample image is processed to reduce image quality by a generative model of a generative adversarial network (GAN) to be trained (Fig. 5. Paragraph [0055]-GAO discloses the input to the Deep Learning module 120, for the training mode 140, is the time-series low-resolution images 114 generated by the Simulation module 110 (wherein Deep Learning module 120 contains a generative adversarial network (GAN) with a generator model G and a discriminator model D, and Generator G is composed of a residual network module 612 and multiscale upsampling component 614). In paragraph [0059]-GAO discloses the multiscale upsampling component 614 is composed of several pixel shuffle layers 730, 732 and 734 and plural convolutional layers 740 and 742. Using these layers, the model is able to process 2×, 4×, and 8× super-resolution images 750, 752, and 754. In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. Thus, during training, the Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model. Please also read paragraph [0048-0052 and 0062-0066]);
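For illustration of the simulation pipeline summarized above (add DC background, downsample, add noise), the following is a minimal sketch in PyTorch-style Python. The function name, scale factor, DC level, and noise level are assumptions for illustration and are not taken from GAO:

import torch
import torch.nn.functional as F

def simulate_low_res(high_res: torch.Tensor, scale: int = 4,
                     dc_level: float = 0.1, noise_std: float = 0.05) -> torch.Tensor:
    # high_res: (N, C, H, W) ground-truth batch (cf. step 500 as summarized above)
    with_dc = high_res + dc_level                                 # cf. step 506: add DC background
    low_res = F.interpolate(with_dc, scale_factor=1.0 / scale,
                            mode="bicubic", align_corners=False)  # cf. step 508: downsample
    noisy = low_res + noise_std * torch.randn_like(low_res)       # cf. step 510: add (e.g., Gaussian) noise
    return noisy.clamp(0.0, 1.0)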
GAO fails to explicitly teach extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN; separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image; and determining a binary cross entropy (BCE) loss function based on the first score and the second score.
However, HIASA explicitly teaches extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein mean absolute error (MAE) may be used in place of Mean squared error)) by using a discriminative model of the GAN (Fig. 4. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103 (wherein the system contains a GAN, relativistic GAN and/or super resolution GAN)), separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image and determining a binary cross entropy (BCE) loss function based on the first score and the second score (Fig. 4. Paragraph [0056]-HIASA discloses in step S104, the update unit 114 determines whether the first learning is completed (wherein second learning may begin if first learning is complete). In paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. Further in paragraph [0060]-HIASA discloses in step S108, the update unit 114 updates the weight of the discriminator based on the discrimination output and a ground truth label. The ground truth label with respect to the second intermediate high resolution image 206 is 0, and the ground truth label with respect to the actual high resolution image is 1. Sigmoid cross entropy is used as the loss function. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth label, which is 1).
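For clarity of record, the scoring step mapped from HIASA may be sketched as follows, noting that sigmoid cross entropy is equivalent to binary cross entropy computed on logits. D, positive, and reference are hypothetical placeholders assumed to be a logit-output discriminator and image tensors, not HIASA's actual modules:

import torch
import torch.nn.functional as F

def discriminator_bce_loss(D, positive: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    # Score the ground-truth (positive) image and the generator output (reference)
    # separately; ground-truth labels are 1 for the actual high-resolution image
    # and 0 for the generated image, matching the labels described above.
    score_pos = D(positive)            # first score
    score_ref = D(reference.detach())  # second score (detached for the discriminator update)
    return (F.binary_cross_entropy_with_logits(score_pos, torch.ones_like(score_pos))
            + F.binary_cross_entropy_with_logits(score_ref, torch.zeros_like(score_ref)))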
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO of having a GAN-based super-resolution image processing method, comprising: obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of HIASA of having extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image and determining a loss function based on the first score and the second score.
The combination would result in GAO’s method extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a loss function based on the first score and the second score.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of the generated image and allow high-resolution ground-truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
GAO fails to explicitly teach extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network, and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
However, ICHINO explicitly teaches extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)), and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)) is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image (Fig. 4. 
Paragraph [0117]-ICHINO discloses the learning control unit 208 adjusts the machine learning model parameter value of the feature conversion unit 105 so that the super-resolution feature is closer to the feature of the high-resolution image. In paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e. adversarial, reconstruction and similarity)). Therefore, it would have been obvious to one of ordinary skill in the art to extract a second feature from a deteriorated image that has been generated through the well-known techniques of fusion and noise injection. ICHINO discloses contrastive loss functions based on multiple losses and/or loss functions as well as features extracted from high-resolution ground truth images, low-resolution/deteriorated images and converted super-resolution images. Moreover, the distances between features of the ground truth and the deteriorated/super resolution images are greater due to degradation/enlargement, and the loss functions are designed to minimize the distances between ground truth and super resolution images. Thus, it would have been obvious to use fusion and noise to obtain the image. This would have improved the generation of image samples and enhanced the training of the learning model. Please also read paragraph [0111, 0117-0119, and 0143]);
and training parameters of the generative model (Fig. 4. Paragraph [0110]-ICHINO discloses in the process of FIG. 5, the feature discrimination unit 205 outputs a vector for determining (discriminating) whether the input feature is a super-resolution feature or a feature of a high-resolution image (step S251). In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The feature conversion unit 105 and the feature discrimination unit 205 have a structure of a generative adversarial network (GAN) and the adversarial loss calculation unit 206 calculates the adversarial loss using the vector output by the feature discrimination unit 205) by performing backpropagation (Fig. 4. Paragraph [0128]-ICHINO discloses after the process of FIG. 5 is completed, the learning control unit 208 calculates a gradient of a parameter of the neural network of the feature conversion unit 105 using an error backpropagation method (backpropagation) (step S243). In paragraph [0130]-ICHINO discloses the learning control unit 208 optimizes the parameter value so that the loss function value is minimized. In paragraph [0131]-ICHINO discloses the feature conversion learning device 200 calculates a loss for a learning process of the feature discrimination unit 205 (step S245). In paragraph [0132]-ICHINO discloses the learning control unit 208 calculates the gradient of the parameter of the neural network of the feature discrimination unit 205 using the error backpropagation method (step S246)) based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image (Fig. 4. Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the adversarial loss uses a binary cross entropy loss, the similarity loss may be configured to use a contrastive loss, and the reconstruction loss decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases and may use either L1 or L2 distance)).
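For illustration, the claimed training step, as characterized in the mapping above, may be sketched as follows. G, D, and preset_net are hypothetical stand-ins for the generative model, the discriminative model, and the preset network of the claim, and the equal weighting of the two loss terms is an assumption for illustration:

import torch
import torch.nn.functional as F

def generator_training_step(G, D, preset_net, optimizer, input_sample, positive, negative):
    reference = G(input_sample)              # reference sample image output by the generative model
    score_ref = D(reference)
    # adversarial BCE term: the generator is rewarded when D scores its output as real
    bce = F.binary_cross_entropy_with_logits(score_ref, torch.ones_like(score_ref))
    with torch.no_grad():
        f_pos = preset_net(positive)         # fourth feature
        f_neg = preset_net(negative)         # fifth feature
    f_ref = preset_net(reference)            # sixth feature
    # second contrastive term (ratio form; see the claim 4 discussion below):
    # small when the reference feature is near the positive and far from the negative
    contrastive = F.l1_loss(f_ref, f_pos) / (F.l1_loss(f_ref, f_neg) + 1e-8)
    total = bce + contrastive
    optimizer.zero_grad()
    total.backward()                         # backpropagation trains the generator's parameters
    optimizer.step()
    return total.item()

Repeating this step over a training set would yield the target super-resolution network to which a test image is then input.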
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA of having a GAN-based super-resolution image processing method, comprising: obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of ICHINO of having a binary cross entropy (BCE) loss function; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
The combination would result in GAO’s method extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a binary cross entropy (BCE) loss function based on the first score and the second score; extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network, and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of the generated image and allow high-resolution ground-truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 3, GAO in view of HIASA and in further view of ICHINO explicitly teaches the method according to claim 1, GAO further teaches wherein the determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature comprises:
determining a fourth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), loss of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the fourth feature and the sixth feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]);
determining a fifth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), loss of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the fifth feature and the sixth feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]); and
determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), loss of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]).
Regarding claim 4, GAO in view of HIASA and in further view of ICHINO explicitly teaches the method according to claim 3, GAO fails to explicitly teach wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
However, HIASA explicitly teaches wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function (Fig. 1. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103. In paragraph [0023]-HIASA discloses in the present exemplary embodiment, a generator which is a machine learning model converts a low resolution image into a feature map (first feature map) and generates, from the first feature map, two intermediate images (a first intermediate image and a second intermediate image) having higher resolution than that of the low resolution image. In paragraph [0024]-HIASA discloses the generator is trained by using different loss functions for the two intermediate high resolution images. The loss functions include a first loss based on a difference between the intermediate high resolution image and a high resolution image which is ground truth (a ground truth image), and a second loss which is defined based on a discrimination output from a discriminator which discriminates whether an input image is an image generated by the generator. Please also read paragraph [0025-0026, 0054-0061 and 0079-0080]) comprises:
calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function (Fig. 4. Paragraph [0054]-HIASA discloses in step S103, the update unit 114 updates the weight of the generator based on the first loss. The first loss is a loss defined based on a difference between the high resolution image (ground truth image) corresponding to the low resolution image 201 and the intermediate high resolution image. Mean squared error (MSE) is used, but mean absolute error (MAE) or the like may be used. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein the update unit 114 determines whether the first learning is completed by determining whether an amount of change in the weight at the time of update is smaller than a predetermined value)), wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image from the storage unit 111. The low resolution image and the high resolution image corresponding to each other include the same object. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). The generator can be provided with a function of correcting image degradation in addition to a resolution enhancement function. In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0052]-HIASA discloses the first residual component 203 is summed with the low resolution image 201, and a first intermediate high resolution image 205 is generated. The second residual component 204 is summed with the low resolution image 201, and a second intermediate high resolution image 206 is generated), and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature (Fig. 4. Paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. The discriminator discriminates whether the input image is the high resolution image generated by the generator or an actual high resolution image. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. Only the first loss is calculated with respect to the first intermediate high resolution image 205. A weighted sum of the first loss and the second loss is calculated with respect to the second intermediate high resolution image 206. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth.
A sum of the losses of the first intermediate high resolution image 205 and the second intermediate high resolution image 206 is regarded as the loss function of the generator (wherein a third loss may also be used). Therefore, it would have been obvious to one of ordinary skill to specifically calculate a ratio of each L1 loss given HIASA updates the generator and discriminator based on weighted sums and/or comparisons of L1 and cross entropy losses. This may further improve the learning processes of the model. Please also read paragraph [0079-0080]).
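The ratio formulation recited in claim 4 may be sketched as follows for clarity. The function name and the eps stabilizer are assumptions for illustration and are not part of the claim language or of HIASA:

import torch
import torch.nn.functional as F

def second_contrastive_loss(f_fourth: torch.Tensor, f_fifth: torch.Tensor,
                            f_sixth: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    fourth_loss = F.l1_loss(f_sixth, f_fourth)  # MAE between the fourth (positive) and sixth (reference) features
    fifth_loss = F.l1_loss(f_sixth, f_fifth)    # MAE between the fifth (negative) and sixth (reference) features
    # the ratio decreases as the reference feature approaches the positive feature
    # and moves away from the negative feature
    return fourth_loss / (fifth_loss + eps)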
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having a GAN-based super-resolution image processing method, comprising: obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of HIASA of having wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
The combination would result in GAO’s method wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of the generated image and allow high-resolution ground-truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
Regarding claim 5, GAO in view of HIASA and in further view of ICHINO explicitly teaches the method according to claim 1, GAO fails to explicitly teach extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN, and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature, wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image; and the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network.
However, ICHINO explicitly teaches extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)), and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)), wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image (Fig. 4. 
Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e., adversarial, reconstruction and similarity)). Therefore, it would have been obvious to one of ordinary skill in the art to use additive noise to degrade the negative image and for a contrastive learning function to enable a feature of a reference sample image to be close to a feature of the negative sample image and far away from a feature of the positive sample image given the loss functions and image similarity are based on distances between features, the features of the deteriorated images are dissimilar to the original image and the loss functions are designed to increase the similarity between the converted super resolution deteriorated images and the original images. This would have enhanced the precision of the learning model’s training and further reinforced the principles behind the loss functions. Please also read paragraph [0111, 0117-0119, and 0143]); and
the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function (Fig. 4. Paragraph [0110]-ICHINO discloses in the process of FIG. 5, the feature discrimination unit 205 outputs a vector for determining (discriminating) whether the input feature is a super-resolution feature or a feature of a high-resolution image (step S251). In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The feature conversion unit 105 and the feature discrimination unit 205 have a structure of a generative adversarial network (GAN) and the adversarial loss calculation unit 206 calculates the adversarial loss using the vector output by the feature discrimination unit 205), comprises:
training the parameters of the generative model by performing backpropagation (Fig. 4. Paragraph [0128]-ICHINO discloses after the process of FIG. 5 is completed, the learning control unit 208 calculates a gradient of a parameter of the neural network of the feature conversion unit 105 using an error backpropagation method (backpropagation) (step S243). In paragraph [0130]-ICHINO discloses the learning control unit 208 optimizes the parameter value so that the loss function value is minimized. In paragraph [0131]-ICHINO discloses the feature conversion learning device 200 calculates a loss for a learning process of the feature discrimination unit 205 (step S245). In paragraph [0132]-ICHINO discloses the learning control unit 208 calculates the gradient of the parameter of the neural network of the feature discrimination unit 205 using the error backpropagation method (step S246)) based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function (Fig. 4. Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the adversarial loss uses a binary cross entropy loss, the similarity loss may be configured to use a contrastive loss, and the reconstruction loss decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases and may use either L1 or L2 distance)).
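For illustration, the three-term objective mapped above may be sketched as follows. The function names are hypothetical, and the equal weighting of the terms is an assumption for illustration:

import torch
import torch.nn.functional as F

def first_contrastive_loss(f_first: torch.Tensor, f_second: torch.Tensor,
                           f_third: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # drives the reference (third) feature toward the negative (second) feature
    # and away from the positive (first) feature, the opposite sense of the
    # second contrastive term
    return F.l1_loss(f_third, f_second) / (F.l1_loss(f_third, f_first) + eps)

def total_generator_loss(bce: torch.Tensor, first_cl: torch.Tensor,
                         second_cl: torch.Tensor) -> torch.Tensor:
    # a single backward() on this sum backpropagates all three terms through the generator
    return bce + first_cl + second_cl

In use, total_generator_loss(bce, cl1, cl2).backward() followed by an optimizer step would update the generative model on all three losses at once.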
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA of having a GAN-based super-resolution image processing method, comprising: obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of ICHINO of having extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN, and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature, wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image; and the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network.
The combination would result in GAO’s method extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN, and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature, wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image; and the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of the generated image and allow high-resolution ground-truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 7, GAO in view of HIASA and in further view of ICHINO explicitly teaches the method according to claim 6, GAO fails to explicitly teach wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function comprises: calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature, and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature.
However, HIASA explicitly teaches wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function (Fig. 1. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103. In paragraph [0023]-HIASA discloses in the present exemplary embodiment, a generator which is a machine learning model converts a low resolution image into a feature map (first feature map) and generates, from the first feature map, two intermediate images (a first intermediate image and a second intermediate image) having higher resolution than that of the low resolution image. In paragraph [0024]-HIASA discloses the generator is trained by using different loss functions for the two intermediate high resolution images. The loss functions include a first loss based on a difference between the intermediate high resolution image and a high resolution image which is ground truth (a ground truth image), and a second loss which is defined based on a discrimination output from a discriminator which discriminates whether an input image is an image generated by the generator. Please also read paragraph [0025-0026, 0054-0061 and 0079-0080]) comprises:
calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error (Fig. 4. Paragraph [0054]-HIASA discloses in step S103, the update unit 114 updates the weight of the generator based on the first loss. The first loss is a loss defined based on a difference between the high resolution image (ground truth image) corresponding to the low resolution image 201 and the intermediate high resolution image. Mean squared error (MSE) is used, but mean absolute error (MAE) or the like may be used. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein the update unit 114 determines whether the first learning is completed by determining whether an amount of change in the weight at the time of update is smaller than a predetermined value)) between the second feature and the third feature (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image from the storage unit 111. The low resolution image and the high resolution image corresponding to each other include the same object. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). The generator can be provided with a function of correcting image degradation in addition to a resolution enhancement function. In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0052]-HIASA discloses the first residual component 203 is summed with the low resolution image 201, and a first intermediate high resolution image 205 is generated. The second residual component 204 is summed with the low resolution image 201, and a second intermediate high resolution image 206 is generated), and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature (Fig. 4. Paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. The discriminator discriminates whether the input image is the high resolution image generated by the generator or an actual high resolution image. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. Only the first loss is calculated with respect to the first intermediate high resolution image 205. A weighted sum of the first loss and the second loss is calculated with respect to the second intermediate high resolution image 206. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth.
A sum of the losses of the first intermediate high resolution image 205 and the second intermediate high resolution image 206 is regarded as the loss function of the generator (wherein a third loss may also be used). Therefore, it would have been obvious to one of ordinary skill to specifically calculate a ratio of each L1 loss given HIASA updates the generator and discriminator based on weighted sums and/or comparisons of L1 and cross entropy losses. This may further improve the learning processes of the model. Please also read paragraph [0079-0080]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having a GAN-based super-resolution image processing method, comprising: obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of HIASA of having wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function comprises: calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature, and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature.
The combination would result in GAO’s method wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function comprises: calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature, and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of the generated image and allow high-resolution ground-truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
Regarding claim 8, GAO in view of HIASA and in further view of ICHINO explicitly teaches the method according to claim 5. GAO further teaches wherein the method further comprises:
determining a third loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), loss of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the reference sample image and the positive sample image (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]); and
GAO fails to explicitly teach and the training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the third loss function, the second contrastive learning loss function, and the first contrastive learning loss function, to obtain the target super-resolution network.
However, ICHINO explicitly teaches and the training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network (Fig. 4. Paragraph [0110]-ICHINO discloses in the process of FIG. 5, the feature discrimination unit 205 outputs a vector for determining (discriminating) whether the input feature is a super-resolution feature or a feature of a high-resolution image (step S251). In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The feature conversion unit 105 and the feature discrimination unit 205 have a structure of a generative adversarial network (GAN) and the adversarial loss calculation unit 206 calculates the adversarial loss using the vector output by the feature discrimination unit 205) comprises:
training the parameters of the generative model by performing backpropagation (Fig. 4. Paragraph [0128]-ICHINO discloses after the process of FIG. 5 is completed, the learning control unit 208 calculates a gradient of a parameter of the neural network of the feature conversion unit 105 using an error backpropagation method (backpropagation) (step S243). In paragraph [0130]-ICHINO discloses the learning control unit 208 optimizes the parameter value so that the loss function value is minimized. In paragraph [0131]-ICHINO discloses the feature conversion learning device 200 calculates a loss for a learning process of the feature discrimination unit 205 (step S245). In paragraph [0132]-ICHINO discloses the learning control unit 208 calculates the gradient of the parameter of the neural network of the feature discrimination unit 205 using the error backpropagation method (step S246)) based on the BCE loss function, the third loss function, the second contrastive learning loss function, and the first contrastive learning loss function, to obtain the target super-resolution network (Fig. 4. Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the adversarial loss uses a binary cross entropy loss, the similarity loss may be configured to use a contrastive loss, and the reconstruction loss decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases and may use either L1 or L2 distance)).
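As a rough illustration of backpropagation over the combined objective mapped above, the sketch below sums the four recited loss terms and takes a single optimizer step, assuming PyTorch loss tensors and a torch optimizer. The equal weighting and the helper name are assumptions for illustration; neither GAO nor ICHINO fixes the weights in the paragraphs cited here.

def train_step(optimizer, bce_loss, third_loss, first_cl_loss, second_cl_loss):
    # Combine the BCE loss, the third loss, and the two contrastive learning
    # losses; equal weights are an illustrative assumption.
    total_loss = bce_loss + third_loss + second_cl_loss + first_cl_loss
    optimizer.zero_grad()
    total_loss.backward()  # backpropagate through the generative model's parameters
    optimizer.step()       # update the parameters of the generative model
    return total_loss.detach()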
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA of having a GAN-based super-resolution image processing method, comprising: obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of ICHINO of having and the training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the third loss function, the second contrastive learning loss function, and the first contrastive learning loss function, to obtain the target super-resolution network.
Wherein GAO’s method, so modified, trains the parameters of the generative model by performing backpropagation based on the BCE loss function, the third loss function, the second contrastive learning loss function, and the first contrastive learning loss function, to obtain the target super-resolution network.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground-truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 10, GAO explicitly teaches an electronic device (Fig. 11, #1100 called a computing device. Paragraph [0089]. Please also see Fig. 1 and read paragraph [0030]), comprising:
a processor (Fig. 11, #1102 called a processor. Paragraph [0089]); and
a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method (Fig. 11. Paragraph [0089]-GAO discloses computing device 1100 suitable for performing the activities described in the embodiments may include a server 1101. Such a server 1101 may include a central processor (CPU) 1102 coupled to a random access memory (RAM) 1104 and to a read-only memory (ROM) 1106. ROM 1106 may also be other types of storage media to store programs. Please also see Fig. 1 and read paragraph [0030]), the GAN-based super-resolution image processing method (Fig. 6. Paragraph [0050]-GAO discloses a deep residual network is built under the generative adversarial network (GAN) framework to estimate the primitive super-resolution image I.sup.SR (the latent structure features) from the time-series of low-resolution fluorescent images 114) comprises:
obtaining a positive sample image, a negative sample image, and a reference sample image, wherein the positive sample image is a ground-truth super-resolution image corresponding to an input sample image (Fig. 1. Paragraph [0031]-GAO discloses the Simulation module 110 is shown in FIG. 1 as receiving as input a high-resolution image 112 and generating as output plural, simulated, noisy, low-resolution images 114. The input high-resolution image 112 may be obtained from an existing collection of images, may be generated with a fluorescence microscope, or may be obtained in any other possible way. The input high-resolution image 112 needs to show various structures (called herein fluorophores) with enough clarity so that the Deep Learning module can be trained), the negative sample image is an image obtained by performing fusion and noise addition on the input sample image and the positive sample image (Fig. 5. Paragraph [0048]-GAO discloses a flowchart of a method for generating the low-resolution images 114. The method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, based on the first image 112, a step 506 of adding DC background to the time-series plurality of second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating a time-series, low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein the noise added may be gaussian noise). In paragraph [0032]-GAO discloses the Simulation module 110 is designed herein to generate ground-truth high-resolution images that will be used by the Deep Learning module 120 for training (i.e., the Simulation module would generate ground-truth high-resolution images). Please also read paragraph [0036 and 0046]), and the reference sample image is an image output after the input sample image is processed to reduce image quality by a generative model of a generative adversarial network (GAN) to be trained (Fig. 5. Paragraph [0055]-GAO discloses the input to the Deep Learning module 120, for the training mode 140, is the time-series low-resolution images 114 generated by the Simulation module 110 (wherein Deep Learning module 120 contains a generative adversarial network (GAN) with a generator model G and a discriminator model D, and Generator G is composed of a residual network module 612 and the multiscale upsampling component 614). In paragraph [0059]-GAO discloses the multiscale upsampling component 614 is composed of several pixel shuffle layers 730, 732 and 734 and plural convolutional layers 740 and 742. Using these layers, the model is able to process 2×, 4×, and 8× super-resolution images 750, 752, and 754. In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. Thus, during training, the Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model);
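For orientation, the following is a minimal sketch of how the three sample images recited in this limitation could be assembled, assuming PyTorch tensors and an input image already upsampled to the ground-truth size. The fusion weight and Gaussian noise level are illustrative assumptions; the claim requires only fusion and noise addition, and GAO's simulation pipeline likewise adds noise after degrading the image.

import torch

def build_sample_triplet(generator, input_image, ground_truth_hr, alpha=0.5, noise_std=0.05):
    # Positive sample: the ground-truth super-resolution image.
    positive = ground_truth_hr
    # Negative sample: fuse the input sample with the positive sample, then
    # add Gaussian noise.
    fused = alpha * input_image + (1.0 - alpha) * ground_truth_hr
    negative = fused + noise_std * torch.randn_like(fused)
    # Reference sample: the output of the generative model being trained.
    reference = generator(input_image)
    return positive, negative, reference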
GAO fails to explicitly teach extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a binary cross entropy (BCE) loss function based on the first score and the second score;
However, HIASA explicitly teaches extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein mean absolute error (MAE) may be used in place of Mean squared error)) by using a discriminative model of the GAN (Fig. 4. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103 (wherein the system contains a GAN, relativistic GAN and/or super resolution GAN)), separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a binary cross entropy (BCE) loss function based on the first score and the second score (Fig. 4. Paragraph [0056]-HIASA discloses in step S104, the update unit 114 determines whether the first learning is completed (wherein second learning may begin if first learning is complete). In paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. Further in paragraph [0060]-HIASA discloses in step S108, the update unit 114 updates the weight of the discriminator based on the discrimination output and a ground truth label. The ground truth label with respect to the second intermediate high resolution image 206 is 0, and the ground truth label with respect to the actual high resolution image is 1. Sigmoid cross entropy is used as the loss function. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth label, which is 1);
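A compact sketch of the scoring-and-BCE step mapped above follows, assuming the discriminative model returns one logit per input and using HIASA's labeling convention (ground truth 1 for the actual high resolution image, 0 for the generated one). The helper name and the logit interface are assumptions, not drawn from the cited references.

import torch
import torch.nn.functional as F

def bce_from_scores(discriminator, first_feature, third_feature):
    # Separately discriminate the two features to obtain the two scores.
    first_score = discriminator(first_feature)   # score for the positive sample image
    second_score = discriminator(third_feature)  # score for the reference sample image
    # BCE loss with label 1 for the real image and label 0 for the generated image.
    return (F.binary_cross_entropy_with_logits(first_score, torch.ones_like(first_score))
            + F.binary_cross_entropy_with_logits(second_score, torch.zeros_like(second_score)))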
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO of having an electronic device, comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of HIASA of having extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a loss function based on the first score and the second score.
Wherein GAO’s electronic device, so modified, extracts a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performs discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determines a loss function based on the first score and the second score.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground-truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
GAO fails to explicitly teach extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network, and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image; and
training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
However, ICHINO explicitly teaches extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)), and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)), wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image (Fig. 4. 
Paragraph [0117]-ICHINO discloses the learning control unit 208 adjusts the machine learning model parameter value of the feature conversion unit 105 so that the super-resolution feature is closer to the feature of the high-resolution image. In paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e., adversarial, reconstruction, and similarity)). Therefore, it would have been obvious to one of ordinary skill in the art to extract the fifth feature from a deteriorated image that has been generated through the well-known techniques of fusion and noise injection. ICHINO discloses contrastive loss functions based on multiple losses and/or loss functions as well as features extracted from high-resolution ground truth images, low-resolution/deteriorated images, and converted super-resolution images. Moreover, the distances between features of the ground truth and the deteriorated/super-resolution images are greater due to degradation/enlargement, and the loss functions are designed to minimize the distances between ground truth and super-resolution images. Thus, it would have been obvious to use fusion and noise to obtain the negative sample image. This would have improved the generation of image samples and enhanced the training of the learning model. Please also read paragraph [0111, 0117-0119, and 0143]); and
training parameters of the generative model (Fig. 4. Paragraph [0110]-ICHINO discloses in the process of FIG. 5, the feature discrimination unit 205 outputs a vector for determining (discriminating) whether the input feature is a super-resolution feature or a feature of a high-resolution image (step S251). In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The feature conversion unit 105 and the feature discrimination unit 205 have a structure of a generative adversarial network (GAN), and the adversarial loss calculation unit 206 calculates the adversarial loss using the vector output by the feature discrimination unit 205) by performing backpropagation (Fig. 4. Paragraph [0128]-ICHINO discloses after the process of FIG. 5 is completed, the learning control unit 208 calculates a gradient of a parameter of the neural network of the feature conversion unit 105 using an error backpropagation method (backpropagation) (step S243). In paragraph [0130]-ICHINO discloses the learning control unit 208 optimizes the parameter value so that the loss function value is minimized. In paragraph [0131]-ICHINO discloses the feature conversion learning device 200 calculates a loss for a learning process of the feature discrimination unit 205 (step S245). In paragraph [0132]-ICHINO discloses the learning control unit 208 calculates the gradient of the parameter of the neural network of the feature discrimination unit 205 using the error backpropagation method (step S246)) based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image (Fig. 4. Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the adversarial loss uses a binary cross entropy loss, the similarity loss may be configured to use a contrastive loss, and the reconstruction loss decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases and may use either L1 or L2 distance)).
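The sketch below illustrates the second contrastive learning loss limitation with a frozen, preset feature network; a pretrained extractor such as VGG is a common choice, but that choice, the function name, and the epsilon guard are assumptions for illustration rather than requirements of the cited references. Minimizing the ratio pulls the reference feature toward the positive feature and pushes it away from the negative feature, matching the stated purpose of the loss.

import torch
import torch.nn.functional as F

def second_contrastive_loss(preset_net, positive, negative, reference, eps=1e-8):
    # Extract features with the preset network; the positive and negative
    # branches need no gradients, only the reference branch is trained through.
    with torch.no_grad():
        fourth_feature = preset_net(positive)  # feature of the positive sample image
        fifth_feature = preset_net(negative)   # feature of the negative sample image
    sixth_feature = preset_net(reference)      # feature of the reference sample image
    # Ratio of the fourth loss (MAE to the positive feature) to the fifth loss
    # (MAE to the negative feature); eps guards against division by zero.
    fourth_loss = F.l1_loss(fourth_feature, sixth_feature)
    fifth_loss = F.l1_loss(fifth_feature, sixth_feature)
    return fourth_loss / (fifth_loss + eps)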
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA of having an electronic device, comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of ICHINO of having a binary cross entropy (BCE) loss function; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
Wherein GAO’s electronic device, so modified, extracts a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performs discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determines a binary cross entropy (BCE) loss function based on the first score and the second score; extracts a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network, and determines a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image; and trains parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground-truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 11, GAO explicitly teaches a non-transitory computer-readable storage medium having stored thereon a computer program (Fig. 11. Paragraph [0089]-GAO discloses computing device 1100 suitable for performing the activities described in the embodiments may include a server 1101. Such a server 1101 may include a central processor (CPU) 1102 coupled to a random access memory (RAM) 1104 and to a read-only memory (ROM) 1106. ROM 1106 may also be other types of storage media to store programs) for performing a GAN-based super-resolution image processing method (Fig. 11. Paragraph [0030]-GAO discloses the method may be implemented in a computing system 100, as illustrated in FIG. 1, that includes two modules, the Simulation module 110 and the Deep Learning module 120 (wherein computing system contains a GAN for super resolution processing)), the GAN-based super-resolution image processing method (Fig. 6. Paragraph [0050]-GAO discloses a deep residual network is built under the generative adversarial network (GAN) framework to estimate the primitive super-resolution image I.sup.SR (the latent structure features) from the time-series of low-resolution fluorescent images 114) comprising:
obtaining a positive sample image, a negative sample image, and a reference sample image, wherein the positive sample image is a ground-truth super-resolution image corresponding to an input sample image (Fig. 1. Paragraph [0031]-GAO discloses the Simulation module 110 is shown in FIG. 1 as receiving as input a high-resolution image 112 and generating as output plural, simulated, noisy, low-resolution images 114. The input high-resolution image 112 may be obtained from an existing collection of images, may be generated with a fluorescence microscope, or may be obtained in any other possible way. The input high-resolution image 112 needs to show various structures (called herein fluorophores) with enough clarity so that the Deep Learning module can be trained), the negative sample image is an image obtained by performing fusion and noise addition on the input sample image and the positive sample image (Fig. 5. Paragraph [0048]-GAO discloses a flowchart of a method for generating the low-resolution images 114. The method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, based on the first image 112, a step 506 of adding DC background to the time-series plurality of second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating a time-series, low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein the noise added may be gaussian noise). In paragraph [0032]-GAO discloses the Simulation module 110 is designed herein to generate ground-truth high-resolution images that will be used by the Deep Learning module 120 for training (i.e., the Simulation module would generate ground-truth high-resolution images). Please also read paragraph [0036 and 0046]), and the reference sample image is an image output after the input sample image is processed to reduce image quality by a generative model of a generative adversarial network (GAN) to be trained (Fig. 5. Paragraph [0055]-GAO discloses the input to the Deep Learning module 120, for the training mode 140, is the time-series low-resolution images 114 generated by the Simulation module 110 (wherein Deep Learning module 120 contains a generative adversarial network (GAN) with a generator model G and a discriminator model D, and Generator G is composed of a residual network module 612 and the multiscale upsampling component 614). In paragraph [0059]-GAO discloses the multiscale upsampling component 614 is composed of several pixel shuffle layers 730, 732 and 734 and plural convolutional layers 740 and 742. Using these layers, the model is able to process 2×, 4×, and 8× super-resolution images 750, 752, and 754. In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. Thus, during training, the Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model. Please also read paragraph [0048-0052 and 0062-0066]);
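As a structural illustration of the multiscale upsampling component described above (pixel shuffle layers interleaved with convolutions, emitting 2×, 4×, and 8× outputs that each serve as a training interface), here is a minimal PyTorch sketch. The channel counts, kernel sizes, and single-channel output heads are illustrative assumptions; GAO's figures recite the layers and outputs but not these exact dimensions.

import torch.nn as nn

class MultiscaleUpsampler(nn.Module):
    # Sketch of a GAO-style multiscale upsampling head producing 2x, 4x, and
    # 8x super-resolution outputs from a shared feature map.
    def __init__(self, channels=64, out_channels=1):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels * 4, 3, padding=1),
                          nn.PixelShuffle(2))  # each stage doubles the spatial resolution
            for _ in range(3)
        ])
        self.heads = nn.ModuleList([
            nn.Conv2d(channels, out_channels, 3, padding=1) for _ in range(3)
        ])

    def forward(self, features):
        outputs = []
        for stage, head in zip(self.stages, self.heads):
            features = stage(features)      # 2x, then 4x, then 8x feature maps
            outputs.append(head(features))  # 2x, 4x, and 8x super-resolution images
        return outputs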
GAO fails to explicitly teach extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a binary cross entropy (BCE) loss function based on the first score and the second score;
However, HIASA explicitly teaches extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein mean absolute error (MAE) may be used in place of Mean squared error)) by using a discriminative model of the GAN (Fig. 4. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103 (wherein the system contains a GAN, relativistic GAN and/or super resolution GAN)), separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a binary cross entropy (BCE) loss function based on the first score and the second score (Fig. 4. Paragraph [0056]-HIASA discloses in step S104, the update unit 114 determines whether the first learning is completed (wherein second learning may begin if first learning is complete). In paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. Further in paragraph [0060]-HIASA discloses in step S108, the update unit 114 updates the weight of the discriminator based on the discrimination output and a ground truth label. The ground truth label with respect to the second intermediate high resolution image 206 is 0, and the ground truth label with respect to the actual high resolution image is 1. Sigmoid cross entropy is used as the loss function. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth label, which is 1);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO of having a non-transitory computer-readable storage medium having stored thereon a computer program for performing a GAN-based super-resolution image processing method, with the teachings of HIASA of having extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a loss function based on the first score and the second score.
Wherein GAO’s non-transitory computer-readable storage medium, so modified, provides for extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a loss function based on the first score and the second score.
The motivation behind the modification would have been to obtain a non-transitory computer-readable storage medium that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground-truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
GAO fails to explicitly teach extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network, and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
However, ICHINO explicitly teaches extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)), and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)), wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image (Fig. 4. 
Paragraph [0117]-ICHINO discloses the learning control unit 208 adjusts the machine learning model parameter value of the feature conversion unit 105 so that the super-resolution feature is closer to the feature of the high-resolution image. In paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e., adversarial, reconstruction, and similarity)). Therefore, it would have been obvious to one of ordinary skill in the art to extract the fifth feature from a deteriorated image that has been generated through the well-known techniques of fusion and noise injection. ICHINO discloses contrastive loss functions based on multiple losses and/or loss functions as well as features extracted from high-resolution ground truth images, low-resolution/deteriorated images, and converted super-resolution images. Moreover, the distances between features of the ground truth and the deteriorated/super-resolution images are greater due to degradation/enlargement, and the loss functions are designed to minimize the distances between ground truth and super-resolution images. Thus, it would have been obvious to use fusion and noise to obtain the negative sample image. This would have improved the generation of image samples and enhanced the training of the learning model. Please also read paragraph [0111, 0117-0119, and 0143]); and
training parameters of the generative model (Fig. 4. Paragraph [0110]-ICHINO discloses in the process of FIG. 5, the feature discrimination unit 205 outputs a vector for determining (discriminating) whether the input feature is a super-resolution feature or a feature of a high-resolution image (step S251). In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The feature conversion unit 105 and the feature discrimination unit 205 have a structure of a generative adversarial network (GAN), and the adversarial loss calculation unit 206 calculates the adversarial loss using the vector output by the feature discrimination unit 205) by performing backpropagation (Fig. 4. Paragraph [0128]-ICHINO discloses after the process of FIG. 5 is completed, the learning control unit 208 calculates a gradient of a parameter of the neural network of the feature conversion unit 105 using an error backpropagation method (backpropagation) (step S243). In paragraph [0130]-ICHINO discloses the learning control unit 208 optimizes the parameter value so that the loss function value is minimized. In paragraph [0131]-ICHINO discloses the feature conversion learning device 200 calculates a loss for a learning process of the feature discrimination unit 205 (step S245). In paragraph [0132]-ICHINO discloses the learning control unit 208 calculates the gradient of the parameter of the neural network of the feature discrimination unit 205 using the error backpropagation method (step S246)) based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image (Fig. 4. Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the adversarial loss uses a binary cross entropy loss, the similarity loss may be configured to use a contrastive loss, and the reconstruction loss decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases and may use either L1 or L2 distance)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA of having a non-transitory computer-readable storage medium having stored thereon a computer program for performing a GAN-based super-resolution image processing method, with the teachings of ICHINO of having a binary cross entropy (BCE) loss function; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
Wherein GAO’s non-transitory computer-readable storage medium, so modified, provides for extracting a first feature corresponding to the positive sample image and a third feature corresponding to the reference sample image by using a discriminative model of the GAN, separately performing discrimination on the first feature and the third feature to obtain a first score corresponding to the positive sample image and a second score corresponding to the reference sample image, and determining a binary cross entropy (BCE) loss function based on the first score and the second score; extracting a fourth feature corresponding to the positive sample image, a fifth feature corresponding to the negative sample image, and a sixth feature corresponding to the reference sample image by using a preset network, and determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature, wherein the second contrastive learning loss function is used for enabling a feature of the reference sample image to be close to a feature of the positive sample image and far away from a feature of the negative sample image; and training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network, so that super-resolution processing is performed on a test image based on the target super-resolution network to obtain a target super-resolution image.
The motivation behind the modification would have been to obtain a non-transitory computer-readable storage medium that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground-truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground-truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 13, GAO in view of HIASA and in further view of ICHINO explicitly teaches the electronic device according to claim 10. GAO further teaches wherein the determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature comprises:
determining a fourth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), loss of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the fourth feature and the sixth feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]);
determining a fifth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), loss of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the fifth feature and the sixth feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]); and
determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), the losses of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]).
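For clarity of the record, the limitation mapped above may be summarized in shorthand (examiner's illustrative notation only; the symbols appear in none of the cited references). Writing $f_4$, $f_5$, $f_6$ for the fourth, fifth, and sixth features and $d(\cdot,\cdot)$ for a feature distance:
$$\mathcal{L}_4 = d(f_4, f_6), \qquad \mathcal{L}_5 = d(f_5, f_6), \qquad \mathcal{L}_{CL2} = g(\mathcal{L}_4, \mathcal{L}_5),$$
where the combining function $g$ is left open at this level of generality and is narrowed to a ratio of L1 losses in dependent claims 14 and 21, discussed below.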
Regarding claim 14, GAO in view of HIASA and in further view of ICHINO explicitly teaches the electronic device according to claim 13. GAO fails to explicitly teach wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
However, HIASA explicitly teaches wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function (Fig. 1. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103. In paragraph [0023]-HIASA discloses in the present exemplary embodiment, a generator which is a machine learning model converts a low resolution image into a feature map (first feature map) and generates, from the first feature map, two intermediate images (a first intermediate image and a second intermediate image) having higher resolution than that of the low resolution image. In paragraph [0024]-HIASA discloses the generator is trained by using different loss functions for the two intermediate high resolution images. The loss functions include a first loss based on a difference between the intermediate high resolution image and a high resolution image which is ground truth (a ground truth image), and a second loss which is defined based on a discrimination output from a discriminator which discriminates whether an input image is an image generated by the generator. Please also read paragraph [0025-0026, 0054-0061 and 0079-0080]) comprises:
calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function (Fig. 4. Paragraph [0054]-HIASA discloses in step S103, the update unit 114 updates the weight of the generator based on the first loss. The first loss is a loss defined based on a difference between the high resolution image (ground truth image) corresponding to the low resolution image 201 and the intermediate high resolution image. Mean squared error (MSE) is used, but mean absolute error (MAE) or the like may be used. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein the update unit 114 determines whether the first learning is completed by determining whether an amount of change in the weight at the time of update is smaller than a predetermined value)), wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image from the storage unit 111. The low resolution image and the high resolution image corresponding to each other include the same object. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). The generator can be provided with a function of correcting image degradation in addition to a resolution enhancement function. In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0052]-HIASA discloses the first residual component 203 is summed with the low resolution image 201, and a first intermediate high resolution image 205 is generated. The second residual component 204 is summed with the low resolution image 201, and a second intermediate high resolution image 206 is generated), and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature (Fig. 4. Paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. The discriminator discriminates whether the input image is the high resolution image generated by the generator or an actual high resolution image. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. Only the first loss is calculated with respect to the first intermediate high resolution image 205. A weighted sum of the first loss and the second loss is calculated with respect to the second intermediate high resolution image 206. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth.
A sum of the losses of the first intermediate high resolution image 205 and the second intermediate high resolution image 206 is regarded as the loss function of the generator (wherein a third loss may also be used). Therefore, it would have been obvious to one of ordinary skill to specifically calculate a ratio of each L1 loss given HIASA updates the generator and discriminator based on weighted sums and/or comparisons of L1 and cross entropy losses. This may further improve the learning processes of the model. Please also read paragraph [0079-0080]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having an electronic device comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of HIASA of having wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
GAO’s electronic device, as modified by HIASA, would thus determine the second contrastive learning loss function based on the fourth loss function and the fifth loss function by calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
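For clarity of the record, the following is a minimal sketch of the computation recited in claim 14 as mapped above (examiner's illustration only, assuming PyTorch; the names f4, f5, f6 and the eps guard are hypothetical and are not code from GAO, HIASA, or ICHINO):

    import torch

    def second_contrastive_loss(f4, f5, f6, eps=1e-8):
        # Fourth loss function: L1 loss, i.e., mean absolute error between
        # the fourth feature and the sixth feature.
        l4 = torch.mean(torch.abs(f4 - f6))
        # Fifth loss function: L1 loss, i.e., mean absolute error between
        # the fifth feature and the sixth feature.
        l5 = torch.mean(torch.abs(f5 - f6))
        # Second contrastive learning loss function: ratio of the fourth
        # loss to the fifth loss (eps guards against division by zero).
        return l4 / (l5 + eps)

Minimizing such a ratio drives the numerator distance down while driving the denominator distance up, which is the contrastive behavior that the combination of GAO's L1-style losses with HIASA's weighted loss comparisons is relied upon to provide.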
Regarding claim 15, GAO in view of HIASA and in further view of ICHINO explicitly teaches the electronic device according to claim 10. GAO fails to explicitly teach wherein the method further comprises: extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN, and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature, wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image; and the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network.
However, ICHINO explicitly teaches wherein the method further comprises:
extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)), and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)), wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image (Fig. 4. Paragraph [0117]-ICHINO discloses the learning control unit 208 adjusts the machine learning model parameter value of the feature conversion unit 105 so that the super-resolution feature is closer to the feature of the high-resolution image. 
In paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e. adversarial, reconstruction and similarity)). Therefore, it would have been obvious to one of ordinary skill in the art to extract a second feature from a deteriorated image that has been generated through the well-known techniques of fusion and noise injection. In ICHINO, the contrastive loss functions are based on multiple losses and/or loss functions as well as features extracted from high-resolution ground truth images, low-resolution/deteriorated images and converted super-resolution images. In addition, the distances between features of the ground truth and the deteriorated/super resolution images are greater due to degradation/enlargement, and the loss functions are designed to minimize the distances between ground truth and super resolution images. Thus, it would have been obvious to use noise and/or resolution reduction to obtain the degraded image. This would have improved the generation of negative samples and enhanced the training of the learning model. Please also read paragraph [0111, 0117-0119, and 0143]); and
the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network (Fig. 4. Paragraph [0110]-ICHINO discloses in the process of FIG. 5, the feature discrimination unit 205 outputs a vector for determining (discriminating) whether the input feature is a super-resolution feature or a feature of a high-resolution image (step S251). In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The feature conversion unit 105 and the feature discrimination unit 205 have a structure of a generative adversarial network (GAN) and the adversarial loss calculation unit 206 calculates the adversarial loss using the vector output by the feature discrimination unit 205) comprises:
training the parameters of the generative model by performing backpropagation (Fig. 4. Paragraph [0128]-ICHINO discloses after the process of FIG. 5 is completed, the learning control unit 208 calculates a gradient of a parameter of the neural network of the feature conversion unit 105 using an error backpropagation method (backpropagation) (step S243). In paragraph [0130]-ICHINO discloses the learning control unit 208 optimizes the parameter value so that the loss function value is minimized. In paragraph [0131]-ICHINO discloses the feature conversion learning device 200 calculates a loss for a learning process of the feature discrimination unit 205 (step S245). In paragraph [0132]-ICHINO discloses the learning control unit 208 calculates the gradient of the parameter of the neural network of the feature discrimination unit 205 using the error backpropagation method (step S246)) based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network (Fig. 4. Paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the adversarial loss uses a binary cross entropy loss, the similarity loss may be configured to use a contrastive loss, and the reconstruction loss decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases and may use either L1 or L2 distance)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having an electronic device, comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of ICHINO of having wherein the method further comprises: extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN, and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature, wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image; and the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network.
GAO’s electronic device, as modified by ICHINO, would thus have wherein the method further comprises: extracting a second feature corresponding to the negative sample image by using the discriminative model of the GAN, and determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature, wherein the first contrastive learning loss function is used for enabling the feature of the reference sample image to be close to the feature of the negative sample image and far away from the feature of the positive sample image; and the training parameters of the generative model by performing backpropagation based on the BCE loss function and the second contrastive learning loss function, to obtain a target super-resolution network comprises: training the parameters of the generative model by performing backpropagation based on the BCE loss function, the first contrastive learning loss function, and the second contrastive learning loss function, to obtain the target super-resolution network.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
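As a minimal sketch of the combined generator update mapped for claim 15 (examiner's illustration only, assuming PyTorch and a discriminator whose output passes through a sigmoid; the generator, discriminator, optimizer, and all names here are hypothetical and are not code from the cited references):

    import torch
    import torch.nn.functional as F

    def generator_step(generator, discriminator, optimizer,
                       input_img, cl1_loss, cl2_loss):
        # Reference sample image: output of the generative model for the
        # input sample image.
        reference_img = generator(input_img)
        # BCE (adversarial) loss: the generator is penalized when the
        # discriminator scores its output as fake.
        score = discriminator(reference_img)
        bce = F.binary_cross_entropy(score, torch.ones_like(score))
        # Backpropagate the BCE loss together with the first and second
        # contrastive learning losses to train the generator's parameters.
        total = bce + cl1_loss + cl2_loss
        optimizer.zero_grad()
        total.backward()
        optimizer.step()
        return total

Here cl1_loss and cl2_loss are assumed to be differentiable tensors computed from the first/second/third features and the fourth/fifth/sixth features, respectively, in the manner discussed for claims 16, 17, 20, and 21 below.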
Regarding claim 16, GAO in view of HIASA and in further view of ICHINO explicitly teaches the electronic device according to claim 15. GAO further teaches determining a first loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), the losses of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the second feature and the third feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]);
determining a second loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), the losses of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the first feature and the third feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]); and
GAO fails to explicitly teach wherein the determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature comprises: determining the first contrastive learning loss function based on the first loss function and the second loss function.
However, ICHINO explicitly teaches wherein the determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)) comprises:
determining the first contrastive learning loss function (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)) based on the first loss function and the second loss function (Fig. 4. Paragraph [0117]-ICHINO discloses the learning control unit 208 adjusts the machine learning model parameter value of the feature conversion unit 105 so that the super-resolution feature is closer to the feature of the high-resolution image. In paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e. adversarial, reconstruction and similarity))).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having an electronic device, comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of ICHINO of having wherein the determining a first contrastive learning loss function based on the first feature, the second feature, and the third feature comprises: determining the first contrastive learning loss function based on the first loss function and the second loss function.
GAO’s electronic device, as modified by ICHINO, would thus determine the first contrastive learning loss function based on the first feature, the second feature, and the third feature by determining the first contrastive learning loss function based on the first loss function and the second loss function.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 17, GAO in view of HIASA and in further view of ICHINO explicitly teaches the electronic device according to claim 16. GAO fails to explicitly teach wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function comprises: calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature, and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature.
However, HIASA explicitly teaches wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function (Fig. 1. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103. In paragraph [0023]-HIASA discloses in the present exemplary embodiment, a generator which is a machine learning model converts a low-resolution image into a feature map (first feature map) and generates, from the first feature map, two intermediate images (a first intermediate image and a second intermediate image) having higher resolution than that of the low-resolution image. In paragraph [0024]-HIASA discloses the generator is trained by using different loss functions for the two intermediate high-resolution images. The loss functions include a first loss based on a difference between the intermediate high-resolution image and a high resolution image which is ground truth (a ground truth image), and a second loss which is defined based on a discrimination output from a discriminator which discriminates whether an input image is an image generated by the generator. Please also read paragraph [0025-0026, 0054-0061 and 0079-0080]) comprises:
calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function (Fig. 4. Paragraph [0054]-HIASA discloses in step S103, the update unit 114 updates the weight of the generator based on the first loss. The first loss is a loss defined based on a difference between the high-resolution image (ground truth image) corresponding to the low-resolution image 201 and the intermediate high-resolution image. Mean squared error (MSE) is used, but mean absolute error (MAE) or the like may be used. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein the update unit 114 determines whether the first learning is completed by determining whether an amount of change in the weight at the time of update is smaller than a predetermined value)), wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image from the storage unit 111. The low resolution image and the high resolution image corresponding to each other include the same object. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). The generator can be provided with a function of correcting image degradation in addition to a resolution enhancement function. In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0052]-HIASA discloses the first residual component 203 is summed with the low resolution image 201, and a first intermediate high resolution image 205 is generated. The second residual component 204 is summed with the low resolution image 201, and a second intermediate high resolution image 206 is generated), and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature (Fig. 4. Paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high-resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. The discriminator discriminates whether the input image is the high-resolution image generated by the generator or an actual high-resolution image. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. Only the first loss is calculated with respect to the first intermediate high-resolution image 205. A weighted sum of the first loss and the second loss is calculated with respect to the second intermediate high-resolution image 206. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high-resolution image 206 to the discriminator and the ground truth.
A sum of the losses of the first intermediate high-resolution image 205 and the second intermediate high-resolution image 206 is regarded as the loss function of the generator (wherein a third loss may also be used). Therefore, it would have been obvious to one of ordinary skill to specifically calculate a ratio of each L1 loss given HIASA updates the generator and discriminator based on weighted sums and/or comparisons of L1 and cross entropy losses. This may further improve the learning processes of the model. Please also read paragraph [0079-0080]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having an electronic device comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory, and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of HIASA of having wherein the determining the first contrastive learning loss function based on the first loss function and the second loss function comprises: calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature, and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature.
GAO’s electronic device, as modified by HIASA, would thus determine the first contrastive learning loss function based on the first loss function and the second loss function by calculating a ratio of the first loss function to the second loss function to obtain the first contrastive learning loss function, wherein the first loss function is an L1 loss function representing a mean absolute error between the second feature and the third feature, and the second loss function is an L1 loss function representing a mean absolute error between the first feature and the third feature.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground truth images to be obtained from low-resolution images, while HIASA’s systems and methods improve the performance of generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
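To illustrate the directional effect of the ratio recited in claim 17 (examiner's worked example with arbitrary numbers, on the apparent mapping of the first, second, and third features to the positive-sample, negative-sample, and reference-sample features, respectively): if the first loss function (MAE between the second and third features) evaluates to 0.2 and the second loss function (MAE between the first and third features) evaluates to 0.8, the first contrastive learning loss function is 0.2 / 0.8 = 0.25. Gradient descent on this ratio shrinks the numerator, drawing the reference-sample feature toward the negative-sample feature, while growing the denominator, pushing it away from the positive-sample feature, consistent with the purpose recited for the first contrastive learning loss function in claim 15.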
Regarding claim 20, GAO in view of HIASA and in further view of ICHINO explicitly teaches the non-transitory computer-readable storage medium according to claim 11. GAO further teaches
determining a fourth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), the losses of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the fourth feature and the sixth feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]);
determining a fifth loss function (Fig. 7. Paragraph [0053]-GAO discloses the generator is forced to optimize the generative loss, which is composed of (1) perceptual loss, (2) content loss, and (3) adversarial loss. Further in paragraph [0062]-GAO discloses FIG. 6 shows that depending on various scores 630 and 632 (where score 630 shows an example of the discriminator scoring the super-resolution image generated by the novel model while score 632 shows an example of the discriminator scoring the true high-resolution image), the losses of the generator G and discriminator D are evaluated in blocks 640 and 642 (block 640 shows the loss used to train the generator while block 642 shows the loss used to train the discriminator network) and finally the targets 650, 652 and 654 show the ground truth labels, and are used to calculate the losses of the generator and discriminator. Please also read paragraph [0063-0066]) based on the fifth feature and the sixth feature (Fig. 6. Paragraph [0048]-GAO discloses the method includes a step 500 of receiving a first image 112 having a first resolution, a step 504 of generating a plurality of second images 302, having the first resolution, a step 506 of adding DC background to the second images 302 to generate a plurality of third images 304, having the first resolution, a step 508 of downsampling the plurality of third images 304 to obtain a plurality of fourth images 306, which have a second resolution, lower than the first resolution, and a step 510 of generating low-resolution images 114 by adding noise to the plurality of fourth images, where the time-series, low-resolution images 114 have the second resolution (wherein in training mode, the low-resolution images 114 generated by the Simulation module 110 are fed to Generator G, which is composed of a residual network module and multiscale upsampling component). In paragraph [0060]-GAO discloses the generator model can output and thus calculate the training error of multi-scale super-resolution images, ranging from 2× to 8×, which means that the model has multiple training interfaces 760, 762, and 764 for back propagation. The Deep Learning module uses the 2×, 4×, 8× high-resolution ground-truth images 750, 752, and 754 to tune the model and simultaneously to ensure that the dimensionality of the images increases smoothly and gradually without introducing too much fake detail. Please also read paragraph [0068-0073]); and
GAO fails to explicitly teach wherein the determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature comprises: determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function.
However, ICHINO explicitly teaches wherein the determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature (Fig. 3. Paragraph [0096]-ICHINO discloses FIG. 4 is a diagram showing an example of a processing procedure for performing a learning process of the feature conversion unit 105 in the feature conversion learning device 200. In paragraph [0101]-ICHINO discloses in the process of the loop L11, the image reduction unit 202 reduces a high-resolution image of the labeled image (step S221). The image reduction unit 202 reduces the image by thinning out the pixels of the image. The reduced image corresponds to a low-resolution image. In paragraph [0102]-ICHINO discloses the image enlargement unit 103 enlarges the reduced image (step S222) (wherein the enlarged image is referred to as a deteriorated image). In paragraph [0103]-ICHINO discloses the feature extraction unit 104 extracts a feature of the deteriorated image (step S223). In paragraph [0104]-ICHINO discloses the feature conversion unit 105 converts the feature of the deteriorated image (step S224) (wherein a feature after conversion is referred to as a super-resolution feature). In paragraph [0105]-ICHINO discloses the feature extraction unit 104 extracts a feature of the high-resolution image of the labeled image (step S231). In paragraph [0108]-ICHINO discloses after the end of the loop L11, the feature conversion learning device 200 calculates a loss for a learning process of the feature conversion unit 105 (step S242)) comprises:
determining the second contrastive learning loss function (Fig. 4. Paragraph [0109]-ICHINO discloses FIG. 5 is a diagram showing a processing procedure in which the feature conversion learning device 200 calculates a loss. In paragraph [0111]-ICHINO discloses the adversarial loss calculation unit 206 calculates the loss using the vector output by the feature discrimination unit 205 (step S252). The adversarial loss is a loss whose value decreases when the discriminator makes erroneous determination (wherein the adversarial loss may use a binary cross entropy loss). In paragraph [0117]-ICHINO discloses the reconstruction loss calculation unit 204 calculates a reconstruction loss that decreases as the similarity between the super-resolution feature and the feature of the high-resolution image increases (step S261). In paragraph [0118]-ICHINO discloses the reconstruction loss calculation unit 204 may be configured to calculate the reconstruction loss so that the reconstruction loss decreases as a distance L2 associated with the feature of the super-resolution feature and the feature of the high-resolution image decreases (wherein the reconstruction loss may also use an L1 distance). In paragraph [0119]-ICHINO discloses the similarity loss calculation unit 207 calculates the similarity loss using the class label (wherein the similarity loss may also be a contrastive loss function)) based on the fourth loss function and the fifth loss function (Fig. 4. Paragraph [0117]-ICHINO discloses the learning control unit 208 adjusts the machine learning model parameter value of the feature conversion unit 105 so that the super-resolution feature is closer to the feature of the high-resolution image. In paragraph [0125]-ICHINO discloses after steps S252, S261, and S271, the loss function calculation unit 203 calculates a total loss function value on the basis of the adversarial loss, the reconstruction loss, and the similarity loss (step S261) (wherein the total loss value decreases as the other losses decrease (i.e. adversarial, reconstruction and similarity))).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO of having a non-transitory computer-readable storage medium having stored thereon a computer program for performing a GAN-based super-resolution image processing method, with the teachings of ICHINO of having wherein the determining a second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature comprises: determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function.
GAO’s non-transitory computer-readable storage medium, as modified by ICHINO, would thus determine the second contrastive learning loss function based on the fourth feature, the fifth feature, and the sixth feature by determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function.
The motivation behind the modification would have been to obtain a non-transitory computer-readable storage medium that improves the training of learning models, super-resolution image processing, and the quality of high-resolution ground truth images, since both GAO and ICHINO concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high-resolution ground truth images to be obtained from low-resolution images, while ICHINO’s systems and methods improve the accuracy of authentication processes and the ability to produce super-resolution images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and ICHINO et al. (US 20230360440 A1), Paragraph [0043, 0066 and 0081-0092].
Regarding claim 21, GAO in view of HIASA and in further view of ICHINO explicitly teaches the non-transitory computer-readable storage medium according to claim 20. GAO fails to explicitly teach wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
However, HIASA explicitly teaches wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function (Fig. 1. Paragraph [0032]-HIASA discloses the image processing system 100 includes a learning apparatus 101, a resolution enhancement apparatus 102, and a control apparatus 103. In paragraph [0023]-HIASA discloses in the present exemplary embodiment, a generator which is a machine learning model converts a low resolution image into a feature map (first feature map) and generates, from the first feature map, two intermediate images (a first intermediate image and a second intermediate image) having higher resolution than that of the low resolution image. In paragraph [0024]-HIASA discloses the generator is trained by using different loss functions for the two intermediate high resolution images. The loss functions include a first loss based on a difference between the intermediate high resolution image and a high resolution image which is ground truth (a ground truth image), and a second loss which is defined based on a discrimination output from a discriminator which discriminates whether an input image is an image generated by the generator. Please also read paragraph [0025-0026, 0054-0061 and 0079-0080]) comprises:
calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function (Fig. 4. Paragraph [0054]-HIASA discloses in step S103, the update unit 114 updates the weight of the generator based on the first loss. The first loss is a loss defined based on a difference between the high resolution image (ground truth image) corresponding to the low resolution image 201 and the intermediate high resolution image. Mean squared error (MSE) is used, but mean absolute error (MAE) or the like may be used. In paragraph [0055]-HIASA discloses a sum of the MSE of the first intermediate high resolution image 205 and the high resolution image and the MSE of the second intermediate high resolution image 206 and the high resolution image is used as the loss function, and the weight of the generator is updated by backpropagation (wherein the update unit 114 determines whether the first learning is completed by determining whether an amount of change in the weight at the time of update is smaller than a predetermined value)), wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature (Fig. 4. Paragraph [0039]-HIASA discloses in step S101, the acquisition unit 112 acquires one or more sets of a high resolution image and a low resolution image from the storage unit 111. The low resolution image and the high resolution image corresponding to each other include the same object. The low resolution image may be generated by downsampling the high resolution image (wherein the low resolution image may be degraded by adding compressive noise and later upsampled to match the size of the high resolution image). The generator can be provided with a function of correcting image degradation in addition to a resolution enhancement function. In paragraph [0040]-HIASA discloses in step S102, the calculation unit 113 inputs the low resolution image to the generator to generate the first and second intermediate high resolution images. In paragraph [0052]-HIASA discloses the first residual component 203 is summed with the low resolution image 201, and a first intermediate high resolution image 205 is generated. The second residual component 204 is summed with the low resolution image 201, and a second intermediate high resolution image 206 is generated), and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature (Fig. 4. Paragraph [0059]-HIASA discloses in step S107, the calculation unit 113 inputs the second intermediate high resolution image 206 and the high resolution image individually to the discriminator to generate respective discrimination outputs. The discriminator discriminates whether the input image is the high resolution image generated by the generator or an actual high resolution image. In paragraph [0061]-HIASA discloses in step S109, the update unit 114 updates the weight of the generator based on the first loss and the second loss. Only the first loss is calculated with respect to the first intermediate high resolution image 205. A weighted sum of the first loss and the second loss is calculated with respect to the second intermediate high resolution image 206. The second loss is the sigmoid cross entropy between the discrimination output obtained by inputting the second intermediate high resolution image 206 to the discriminator and the ground truth.
A sum of the losses of the first intermediate high resolution image 205 and the second intermediate high resolution image 206 is regarded as the loss function of the generator (wherein a third loss may also be used). Therefore, it would have been obvious to one of ordinary skill in the art to calculate a ratio of the L1 losses, given that HIASA updates the generator and discriminator based on weighted sums and/or comparisons of L1 and cross-entropy losses; doing so may further improve the learning process of the model. Please also read paragraph [0079-0080]).
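By way of illustration only, the claimed ratio of two L1 losses discussed above may be sketched as follows. This is a minimal PyTorch-style sketch; the function name, the argument names, and the epsilon guard are illustrative assumptions and are not drawn from HIASA or from the claims.

    import torch
    import torch.nn.functional as F

    def second_contrastive_loss(fourth_feature, fifth_feature, sixth_feature, eps=1e-8):
        # Fourth loss: L1 loss (mean absolute error) between the fourth
        # feature and the sixth feature.
        fourth_loss = F.l1_loss(fourth_feature, sixth_feature)
        # Fifth loss: L1 loss (mean absolute error) between the fifth
        # feature and the sixth feature.
        fifth_loss = F.l1_loss(fifth_feature, sixth_feature)
        # Ratio of the fourth loss to the fifth loss; eps is an
        # illustrative guard against division by zero, not a claimed feature.
        return fourth_loss / (fifth_loss + eps)

Minimizing such a ratio decreases the L1 distance in the numerator while increasing the L1 distance in the denominator, which is consistent with a contrastive learning objective.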
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO, of having a non-transitory computer-readable storage medium having stored thereon a computer program for performing a GAN-based super-resolution image processing method, with the teachings of HIASA of having wherein the determining the second contrastive learning loss function based on the fourth loss function and the fifth loss function comprises: calculating a ratio of the fourth loss function to the fifth loss function to obtain the second contrastive learning loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
In the combination, GAO’s non-transitory computer-readable storage medium would thereby determine the second contrastive learning loss function based on the fourth loss function and the fifth loss function by calculating a ratio of the fourth loss function to the fifth loss function, wherein the fourth loss function is an L1 loss function representing a mean absolute error between the fourth feature and the sixth feature, and the fifth loss function is an L1 loss function representing a mean absolute error between the fifth feature and the sixth feature.
The motivation behind the modification would have been to obtain a non-transitory computer-readable storage medium that improves the training of learning models, super-resolution image processing, and the quality of high resolution ground truth images, since both GAO and HIASA concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high resolution ground truth images to be obtained from low resolution images, while HIASA’s systems and methods improve the performance of the generator and discriminator models and the resolution enhancement of images. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and HIASA et al. (US 20220076071 A1), Abstract and Paragraph [0075-0080].
Claims 2, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over GAO et al. (US 20200357096 A1), hereinafter referenced as GAO in view of HIASA et al. (US 20220076071 A1), hereinafter referenced as HIASA and in further view of ICHINO et al. (US 20230360440 A1), hereinafter referenced as ICHINO and in further view of BAI et al. (US 20210125313 A1), hereinafter referenced as BAI.
Regarding claim 2, GAO in view of HIASA and in further view of ICHINO explicitly teaches the method according to claim 1. GAO further teaches wherein a process of generating the negative sample image comprises:
up-sampling the input sample image to obtain a candidate sample image with the same size as the positive sample image (Fig. 6. Paragraph [0047]-GAO discloses the default setting of the simulation takes a 480×480 pixel high-resolution image 112 as the input and simulates 200 frames of 60×60 pixel low-resolution images 114. In paragraph [0050]-GAO discloses a deep residual network is built under the generative adversarial network (GAN) framework to estimate the primitive super-resolution image I.sup.SR (the latent structure features) from the time-series of low-resolution fluorescent images. In paragraph [0055]-GAO discloses the input to the Deep Learning module 120, for the training mode 140, is the time-series low-resolution images 114 generated by the Simulation module 110. For the analysis mode 150, the input would be the low-resolution images derived from an actual microscope (wherein the Deep Learning module contains a multiscale upsampling component with pixel shuffle layers). The pixel shuffle layers, which are used to perform the upscaling of the figure dimensionality, are capable of outputting 2×, 4×, and 8× high-resolution images 750, 752, and 754. Please also read paragraph [0060, 0062-0066]);
and adding Gaussian random noise to the fused image to generate the negative sample image (Fig. 3. Paragraph [0046]-GAO discloses the high-resolution fluorescent images 304 are downsampled in step 303 and random Gaussian noise is added in step 305 to the low-resolution images 306. Here, the noise is also stochastic for different time-series and close to the noise strength that is measured from the real-world microscopy).
GAO fails to explicitly teach determining a first weight corresponding to the candidate sample image, and determining a second weight corresponding to the positive sample image; summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
However, BAI explicitly teaches determining a first weight corresponding to the candidate sample image, and determining a second weight (Fig. 16. Paragraph [0163]-BAI discloses after obtaining the fusion feature maps of multiple layers based on the image to be processed and the corresponding mask image, in the image decoding part, the up-sampling processing may be performed based on the fusion feature maps of the multiple layers to obtain the inpainting result (wherein Gaussian noise images with the same size may be generated and combined with the inpainting results). In paragraph [0164]-BAI discloses the object map may be any fusion feature map and can also be an inpainting result (wherein the fusion feature map is obtained by weighting, and the processed object map is obtained by generating and fusing a first, second, third and fourth clipped map). In paragraph [0172]-BAI discloses the first weight map and the second weight map may be randomly generated images, such as a noise image in which the element values only include 0 and 1 in the randomly generated image) corresponding to the positive sample image (Fig. 16. Paragraph [0109]-BAI discloses the original image is a high-definition image (e.g., an image with a resolution greater than 1024*1024 pixels) (wherein the original image is used for training/implementing an image inpainting deep learning network, the network contains a generative adversarial model with pair and global discriminators that determine the probability/loss of whether the generated images are original images, and the image/inpainting result may be super resolved and spliced with an original image and clipped region). In paragraph [0216]-BAI discloses random total variation loss (RTV loss) function is provided when training the image inpainting network (wherein the RTV loss characterizes the difference between the object map subjected to the element value exchange and/or the element value adjustment and the original image corresponding to the object map). Please also read paragraph [0157 and 0213-0219]);
summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image (Fig. 16. Paragraph [0164]-BAI discloses for each element point in the fused image, the element values of the element points in the first clipped map and the second clipped map are randomly selected to realize random exchange of element values based on the fusion method. The fusion of the first clipped map and the second clipped map may be implemented based on the following Equation 1: A*X1+(1−A)*X2 (wherein X1 and X2 represent a first clipped map and a second clipped map, respectively, and A and 1−A respectively represent a first weight map and a second weight map). Please also read paragraph [0213-0219]).
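For illustration only, the negative sample generation mapped above (up-sampling, weighted fusion per BAI's Equation 1, and Gaussian noise addition per GAO) may be sketched as follows. This is a minimal PyTorch-style sketch; the scalar fusion weight, the noise strength, and the bicubic up-sampling mode are illustrative assumptions (BAI's Equation 1 employs weight maps A and 1−A rather than a scalar weight).

    import torch
    import torch.nn.functional as F

    def generate_negative_sample(input_sample, positive_sample,
                                 first_weight=0.5, noise_sigma=0.05):
        # Expects N x C x H x W tensors. Up-sample the input sample image
        # to the size of the positive (ground-truth) sample image to obtain
        # the candidate sample image; bicubic mode is an illustrative choice.
        candidate = F.interpolate(input_sample, size=positive_sample.shape[-2:],
                                  mode='bicubic', align_corners=False)
        # Weighted fusion, cf. BAI's Equation 1, A*X1 + (1 - A)*X2: sum the
        # product of the candidate and the first weight with the product of
        # the positive sample and the second weight (1 - first_weight).
        fused = first_weight * candidate + (1.0 - first_weight) * positive_sample
        # Add Gaussian random noise to the fused image to generate the
        # negative sample; noise_sigma is an illustrative noise strength.
        return fused + noise_sigma * torch.randn_like(fused)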
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO, of having a GAN-based super-resolution image processing method comprising obtaining a positive sample image, a negative sample image, and a reference sample image, with the teachings of BAI of having determining a first weight corresponding to the candidate sample image, and determining a second weight corresponding to the positive sample image; and summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
In the combination, GAO’s method would include determining a first weight corresponding to the candidate sample image and a second weight corresponding to the positive sample image, and summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
The motivation behind the modification would have been to obtain a method that improves the training of learning models, super-resolution image processing, and the quality of high resolution ground truth images, since both GAO and BAI concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high resolution ground truth images to be obtained from low resolution images, while BAI’s systems and methods improve image processing efficiency and image inpainting results with super resolution processing. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and BAI et al. (US 20210125313 A1), Abstract and Paragraph [0004, 0104 and 0116].
Regarding claim 12, GAO in view of HIASA and in further view of ICHINO explicitly teaches the electronic device according to claim 10. GAO further teaches wherein a process of generating the negative sample image comprises:
up-sampling the input sample image to obtain a candidate sample image with the same size as the positive sample image (Fig. 6. Paragraph [0047]-GAO discloses the default setting of the simulation takes a 480×480 pixel high-resolution image 112 as the input and simulates 200 frames of 60×60 pixel low-resolution images 114. In paragraph [0050]-GAO discloses a deep residual network is built under the generative adversarial network (GAN) framework to estimate the primitive super-resolution image I.sup.SR (the latent structure features) from the time-series of low-resolution fluorescent images. In paragraph [0055]-GAO discloses the input to the Deep Learning module 120, for the training mode 140, is the time-series low-resolution images 114 generated by the Simulation module 110. For the analysis mode 150, the input would be the low-resolution images derived from an actual microscope (wherein the Deep Learning module contains a multiscale upsampling component with pixel shuffle layers). The pixel shuffle layers, which are used to perform the upscaling of the figure dimensionality, are capable of outputting 2×, 4×, and 8× high-resolution images 750, 752, and 754. Please also read paragraph [0060, 0062-0066]); and
adding Gaussian random noise to the fused image to generate the negative sample image (Fig. 3. Paragraph [0046]-GAO discloses the high-resolution fluorescent images 304 are downsampled in step 303 and random Gaussian noise is added in step 305 to the low-resolution images 306. Here, the noise is also stochastic for different time-series and close to the noise strength that is measured from the real-world microscopy).
GAO in view of HIASA and in further view of ICHINO fails to explicitly teach determining a first weight corresponding to the candidate sample image, and determining a second weight corresponding to the positive sample image; summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
However, BAI explicitly teaches determining a first weight corresponding to the candidate sample image, and determining a second weight (Fig. 16. Paragraph [0163]-BAI discloses after obtaining the fusion feature maps of multiple layers based on the image to be processed and the corresponding mask image, in the image decoding part, the up-sampling processing may be performed based on the fusion feature maps of the multiple layers to obtain the inpainting result (wherein Gaussian noise images with the same size may be generated and combined with the inpainting results). In paragraph [0164]-BAI discloses the object map may be any fusion feature map and can also be an inpainting result (wherein the fusion feature map is obtained by weighting, and the processed object map is obtained by generating and fusing a first, second, third and fourth clipped map). In paragraph [0172]-BAI discloses the first weight map and the second weight map may be randomly generated images, such as a noise image in which the element values only include 0 and 1 in the randomly generated image) corresponding to the positive sample image (Fig. 16. Paragraph [0109]-BAI discloses the original image is a high-definition image (e.g., an image with a resolution greater than 1024*1024 pixels) (wherein the original image is used for training/implementing an image inpainting deep learning network, the network contains a generative adversarial model with pair and global discriminators that determine the probability/loss of whether the generated images are original images, and the image/inpainting result may be super resolved and spliced with an original image and clipped region). In paragraph [0216]-BAI discloses random total variation loss (RTV loss) function is provided when training the image inpainting network (wherein the RTV loss characterizes the difference between the object map subjected to the element value exchange and/or the element value adjustment and the original image corresponding to the object map). Please also read paragraph [0157 and 0213-0219]);
summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image (Fig. 16. Paragraph [0164]-BAI discloses for each element point in the fused image, the element values of the element points in the first clipped map and the second clipped map are randomly selected to realize random exchange of element values based on the fusion method. The fusion of the first clipped map and the second clipped map may be implemented based on the following Equation 1: A*X1+(1−A)*X2 (wherein X1 and X2 represent a first clipped map and a second clipped map, respectively, and A and 1−A respectively represent a first weight map and a second weight map). Please also read paragraph [0213-0219]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO, of having an electronic device comprising: a processor; and a memory configured to store instructions executable by the processor, wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement a GAN-based super-resolution image processing method, with the teachings of BAI of having determining a first weight corresponding to the candidate sample image, and determining a second weight corresponding to the positive sample image; and summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
In the combination, GAO’s electronic device would include determining a first weight corresponding to the candidate sample image and a second weight corresponding to the positive sample image, and summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
The motivation behind the modification would have been to obtain an electronic device that improves the training of learning models, super-resolution image processing, and the quality of high resolution ground truth images, since both GAO and BAI concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high resolution ground truth images to be obtained from low resolution images, while BAI’s systems and methods improve image processing efficiency and image inpainting results with super resolution processing. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and BAI et al. (US 20210125313 A1), Abstract and Paragraph [0004, 0104 and 0116].
Regarding claim 19, GAO in view of HIASA and in further view of ICHINO explicitly teaches the non-transitory computer-readable storage medium according to claim 11. GAO further teaches wherein a process of generating the negative sample image comprises:
up-sampling the input sample image to obtain a candidate sample image with the same size as the positive sample image (Fig. 6. Paragraph [0047]-GAO discloses the default setting of the simulation takes a 480×480 pixel high-resolution image 112 as the input and simulates 200 frames of 60×60 pixel low-resolution images 114. In paragraph [0050]-GAO discloses a deep residual network is built under the generative adversarial network (GAN) framework to estimate the primitive super-resolution image I.sup.SR (the latent structure features) from the time-series of low-resolution fluorescent images. In paragraph [0055]-GAO discloses the input to the Deep Learning module 120, for the training mode 140, is the time-series low-resolution images 114 generated by the Simulation module 110. For the analysis mode 150, the input would be the low-resolution images derived from an actual microscope (wherein the Deep Learning module contains a multiscale upsampling component with pixel shuffle layers). The pixel shuffle layers, which are used to perform the upscaling of the figure dimensionality, are capable of outputting 2×, 4×, and 8× high-resolution images 750, 752, and 754. Please also read paragraph [0060, 0062-0066]);
and adding Gaussian random noise to the fused image to generate the negative sample image (Fig. 3. Paragraph [0046]-GAO discloses the high-resolution fluorescent images 304 are downsampled in step 303 and random Gaussian noise is added in step 305 to the low-resolution images 306. Here, the noise is also stochastic for different time-series and close to the noise strength that is measured from the real-world microscopy).
GAO fails to explicitly teach determining a first weight corresponding to the candidate sample image, and determining a second weight corresponding to the positive sample image; summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
However, BAI explicitly teaches determining a first weight corresponding to the candidate sample image, and determining a second weight (Fig. 16. Paragraph [0163]-BAI discloses after obtaining the fusion feature maps of multiple layers based on the image to be processed and the corresponding mask image, in the image decoding part, the up-sampling processing may be performed based on the fusion feature maps of the multiple layers to obtain the inpainting result (wherein Gaussian noise images with the same size may be generated and combined with the inpainting results). In paragraph [0164]-BAI discloses the object map may be any fusion feature map and can also be an inpainting result (wherein the fusion feature map is obtained by weighting, and the processed object map is obtained by generating and fusing a first, second, third and fourth clipped map). In paragraph [0172]-BAI discloses the first weight map and the second weight map may be randomly generated images, such as a noise image in which the element values only include 0 and 1 in the randomly generated image) corresponding to the positive sample image (Fig. 16. Paragraph [0109]-BAI discloses the original image is a high-definition image (e.g., an image with a resolution greater than 1024*1024 pixels) (wherein the original image is used for training/implementing an image inpainting deep learning network, the network contains a generative adversarial model with pair and global discriminators that determine the probability/loss of whether the generated images are original images, and the image/inpainting result may be super resolved and spliced with an original image and clipped region). In paragraph [0216]-BAI discloses random total variation loss (RTV loss) function is provided when training the image inpainting network (wherein the RTV loss characterizes the difference between the object map subjected to the element value exchange and/or the element value adjustment and the original image corresponding to the object map). Please also read paragraph [0157 and 0213-0219]);
summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image (Fig. 16. Paragraph [0164]-BAI discloses for each element point in the fused image, the element values of the element points in the first clipped map and the second clipped map are randomly selected to realize random exchange of element values based on the fusion method. The fusion of the first clipped map and the second clipped map may be implemented based on the following Equation 1: A*X1+(1−A)*X2 (wherein X1 and X2 represent a first clipped map and a second clipped map, respectively, and A and 1−A respectively represent a first weight map and a second weight map). Please also read paragraph [0213-0219]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of GAO in view of HIASA and in further view of ICHINO, of having a non-transitory computer-readable storage medium having stored thereon a computer program for performing a GAN-based super-resolution image processing method, with the teachings of BAI of having determining a first weight corresponding to the candidate sample image, and determining a second weight corresponding to the positive sample image; and summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
In the combination, GAO’s non-transitory computer-readable storage medium would include determining a first weight corresponding to the candidate sample image and a second weight corresponding to the positive sample image, and summing a first product of the candidate sample image and the first weight and a second product of the positive sample image and the second weight to obtain a fused image.
The motivation behind the modification would have been to obtain a non-transitory computer-readable storage medium that improves the training of learning models, super-resolution image processing, and the quality of high resolution ground truth images, since both GAO and BAI concern generative adversarial learning models and super-resolution image processing. GAO’s systems and methods improve the resolution of generated images and allow high resolution ground truth images to be obtained from low resolution images, while BAI’s systems and methods improve image processing efficiency and image inpainting results with super resolution processing. Please see GAO et al. (US 20200357096 A1), Abstract and Paragraph [0029, 0032, and 0048-0052], and BAI et al. (US 20210125313 A1), Abstract and Paragraph [0004, 0104 and 0116].
Conclusion
The prior art made of record and not relied upon, listed below, is considered pertinent to applicant's disclosure.
KEARNEY et al. (US 20210118099 A1)- A novel GAN is trained to predict high fidelity synthetic images based on low quality input dental images. The GAN further takes anatomic masks as inputs with each image, the masks labeling pixels of the image corresponding to dental features. The GAN includes an encoder-decoder generator with semantically aware normalization between stages of the decoder according to the masks. The predicted synthetic dental image and an unpaired dental image are evaluated by a first discriminator of the GAN to obtain a realism estimate. The synthetic image and an unpaired dental image may be processed using a pretrained dental encoder to obtain a perceptual loss. The GAN is trained with the realism estimate, perceptual loss, and L1 loss. Utilization may include inputting noisy, low contrast, low resolution, blurry, or degraded dental images and outputting high resolution, denoised, high contrast, deobfuscated, and sharp dental images. Please see Para. [0123-0132, 0208, 0232-0242, 0434-0448] and Abstract (wherein KEARNEY discloses, for example, a super-resolution GAN, a generator/discriminator, both randomized and adversarial injected noise, multiple loss functions/losses (e.g., loss functions 1-7) including inverse cross entropy, mean squared error and L1, and training images including labeled/annotated real images, mask images, low-resolution degraded images, high-resolution images that may be converted from real images and contaminated with Gaussian noise, and high-resolution synthetic images that may be converted from contaminated images).
REN et al. (US 20210312591 A1)- A method and apparatus are provided. The method includes generating a dataset for real-world super resolution (SR), training a first generative adversarial network (GAN), training a second GAN, and fusing an output of the first GAN and an output of the second GAN. Please see Fig. 1-4 and Abstract.
SHI et al. (US 20210264568 A1)- A neural network is trained to process received visual data to estimate a high-resolution version of the visual data using a training dataset and reference dataset. A set of training data is generated, and a generator convolutional neural network parameterized by first weights and biases is trained by comparing characteristics of the training data to characteristics of the reference dataset. The first network is trained to generate super-resolved image data from low-resolution image data, and the training includes modifying first weights and biases to optimize processed visual data based on the comparison between the characteristics of the training data and the characteristics of the reference dataset. A discriminator convolutional neural network parameterized by second weights and biases is trained by comparing characteristics of the generated super-resolved image data to characteristics of the reference dataset, and the second network is trained to discriminate super-resolved image data from real image data. Please see Fig. 1-7 and Abstract.
EDLUND et al. (US 20230260083 A1)- A computer-implemented method is provided for processing images. The method can include down-sampling a plurality of first images having a first resolution for obtaining a plurality of second images having a second resolution and training an artificial neural network model to process an input image and output an output image having a higher resolution than the input image. Please see Fig. 1-5 and Abstract.
CAI et al. (US 20220399101 A1)- The present disclosure relates to a spatially-variant model of a point spread function and its role in enhancing medical image resolution. For instance, a method of the present disclosure comprises receiving a first medical image having a first resolution, applying a neural network to the first medical image, the neural network including a first subset of layers and, subsequently, a second subset of layers, the first subset of layers of the neural network generating, from the first medical image, a second medical image having a second resolution and the second subset of layers of the neural network generating, from the second medical image, a third medical image having a third resolution, and outputting the third medical image, wherein the first resolution is lower than the second resolution and the second resolution is lower than the third resolution. Please see Fig. 1-7 and Abstract.
CHOI et al. (US 20220122223 A1)- An electronic device includes at least one imaging sensor and at least one processor coupled to the at least one imaging sensor. The at least one imaging sensor is configured to capture a burst of image frames. The at least one processor is configured to generate a low-resolution image from the burst of image frames. The at least one processor is also configured to estimate a blur kernel based on the burst of image frames. The at least one processor is further configured to perform deconvolution on the low-resolution image using the blur kernel to generate a deconvolved image. In addition, the at least one processor is configured to generate a high-resolution image using super resolution (SR) on the deconvolved image. Please see Fig. 3-5 and Abstract.
ESHET et al. (US 20200249314 A1)- A system and method to use deep learning for super resolution in a radar system include obtaining first-resolution time samples from reflections based on transmissions by a first-resolution radar system of multiple frequency-modulated signals. The first-resolution radar system includes multiple transmit elements and multiple receive elements. The method also includes reducing resolution of the first-resolution time samples to obtain second-resolution time samples, implementing a matched filter on the first-resolution time samples to obtain a first-resolution data cube and on the second-resolution time samples to obtain a second-resolution data cube, processing the second-resolution data cube with a neural network to obtain a third-resolution data cube, and training the neural network based on a first loss obtained by comparing the first-resolution data cube with the third-resolution data cube. The neural network is used with a second-resolution radar system to detect one or more objects. Please see Para. [0034-0048] and Abstract.
El-Khamy et al. (US 20180293707 A1)- In a method for super resolution imaging, the method includes: receiving, by a processor, a low resolution image; generating, by the processor, an intermediate high resolution image having an improved resolution compared to the low resolution image; generating, by the processor, a final high resolution image based on the intermediate high resolution image and the low resolution image; and transmitting, by the processor, the final high resolution image to a display device for display thereby. Please see Fig. 2-3 and 5, and Abstract.
BERTHELOT et al. (US 20210407042 A1)- Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes receiving a training image and a ground truth super-resolution image; processing a first training network input comprising the training image using the neural network to generate a first training super-resolution image; processing a first critic input generated from (i) the training image and (ii) the ground truth super-resolution image using a critic neural network to map the first critic input to a latent representation; processing a second critic input generated from (i) the training image and (ii) the first training super-resolution image using the critic neural network to map the second critic input to a latent representation; determining a gradient of a generator loss function that measures a distance between the latent representations of the critic inputs; and determining an update to the parameters. Please see Fig. 1-3 and Abstract.
ZIMMER et al. (US 20200250794 A1)- A microscopy method includes a trained deep neural network that is executed by software using one or more processors of a computing device, the trained deep neural network trained with a training set of images comprising co-registered pairs of high-resolution microscopy images or image patches of a sample and their corresponding low-resolution microscopy images or image patches of the same sample. A microscopy input image of a sample to be imaged is input to the trained deep neural network, which rapidly outputs an output image of the sample, the output image having improved one or more of spatial resolution, depth-of-field, signal-to-noise ratio, and/or image contrast. Please see Para. [0175-0187, 0199-0208], Fig. 1, and Abstract.
TANG et al. (US 20220148130 A1)- Embodiments described herein are generally directed to an end-to-end trainable degradation restoration network (DRN) that enhances the ability of a super-resolution (SR) subnetwork to deal with noisy low-resolution images. An embodiment of a method includes estimating, by a noise estimator (NE) subnetwork of the DRN, an estimated noise map for a noisy input image; and predicting, by the SR subnetwork of the DRN, a clean upscaled image based on the input image and the noise map by, for each of multiple conditional residual dense blocks (CRDBs) stacked within one or more cascade blocks representing the SR subnetwork, adjusting, by a noise control layer of the CRDB that follows a stacked set of multiple residual dense blocks of the CRDB, feature values of an intermediate feature map associated with the input image by applying (i) a scaling factor and (ii) an offset factor derived from the noise map. Please see Fig. 19-24 and Abstract.
LIU et al. (US 20210327054 A1)- Systems and methods for generating a synthesized medical image are provided. An input medical image is received. A synthesized segmentation mask is generated. The input medical image is masked based on the synthesized segmentation mask. The masked input medical image has an unmasked portion and a masked portion. An initial synthesized medical image is generated using a trained machine learning based generator network. The initial synthesized medical image includes a synthesized version of the unmasked portion of the masked input medical image and synthesized patterns in the masked portion of the masked input medical image. The synthesized patterns are fused with the input medical image to generate a final synthesized medical image. Please see Para. [0042-0047 and 0053-0057], Fig. 1, and Abstract (wherein LIU discloses, for example, a GAN, Gaussian noise injection, and a weighted fusion of synthetic images).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Aaron Bonansinga, whose telephone number is (703) 756-5380. The examiner can normally be reached Monday-Friday, 9:00 a.m. - 6:00 p.m. ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chineyere Wills-Burns, can be reached by phone at (571) 272-9752. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AARON TIMOTHY BONANSINGA/Examiner, Art Unit 2673
/CHINEYERE WILLS-BURNS/Supervisory Patent Examiner, Art Unit 2673