DETAILED ACTION
This Office Action is sent in response to Applicant’s Communication received 5/17/2022 for application number 17/746,198.
Claims 1-19 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 10, and 19 recite (for representative claim 1):
A method for optimizing generative adversarial network (GAN) comprising: determining a first weight of a generator and a second weight of a discriminator, wherein the first weight is equal to the second weight, the first weight is configured to indicate a learning ability of the generator, the second weight is configured to indicate a learning ability of the discriminator; and alternative iteratively training the generator and the discriminator until the generator and the discriminator are convergent, which causes a Nash equilibrium between the generator and the discriminator and a determination of probability of the discriminator is 0.5; wherein the first weight and the second weight are in positive correlation; the training of the generator is related to a loss function of the generator, a target of the generator is maximizing the loss function of the generator to match generated sample distribution to real sample distribution; the training of the discriminator is related to a loss function of the generator, a target of the discriminator is minimizing the loss function of the discriminator to determine whether an input sample is a real image or an image generated by the generator.
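The alternating training recited in the claim can be illustrated with a minimal 1-D toy (an illustrative sketch only; the single-parameter generator and logistic discriminator are assumptions for exposition, not drawn from the record or the cited references):

```python
import numpy as np

rng = np.random.default_rng(0)

# 1-D toy: real samples cluster near mu_real; the "generator" shifts unit
# noise by a single weight theta_g, and the "discriminator" is a logistic
# unit D(x) = sigmoid(w*x + b).
mu_real = 2.0
theta_g = 0.1           # "first weight" (generator)
w, b = 0.1, 0.0         # discriminator's "second weight" w, set equal at initialization

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

lr, m = 0.05, 64
for step in range(2000):
    real = mu_real + rng.normal(size=m)
    fake = theta_g + rng.normal(size=m)

    # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)),
    # i.e., minimizing its classification loss.
    d_real, d_fake = sigmoid(w * real + b), sigmoid(w * fake + b)
    grad_w = np.mean((1 - d_real) * real) - np.mean(d_fake * fake)
    grad_b = np.mean(1 - d_real) - np.mean(d_fake)
    w, b = w + lr * grad_w, b + lr * grad_b

    # Generator step: gradient ascent on log D(G(z)) (non-saturating form),
    # pushing the generated distribution toward the real one.
    fake = theta_g + rng.normal(size=m)
    d_fake = sigmoid(w * fake + b)
    theta_g = theta_g + lr * np.mean((1 - d_fake) * w)

print(theta_g)  # drifts toward mu_real as the two players alternate
```

As the two updates alternate, theta_g moves toward mu_real and the discriminator's output on generated samples approaches 0.5, the equilibrium behavior the claim describes.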
(2A, prong 1) The underlined portions of the claim recite an abstract idea, specifically a mathematical calculation. The underlined portions of claims require setting two weights to be the same, calculating two different loss functions for training (and therefore calculating updates to the weights) by maximizing the loss function of the generator and minimizing the loss function of the discriminator until a Nash equilibrium is reached. (In other words, the Applicant has amended the claim to explicitly include mathematical calculations.)
(2A, prong 2) This judicial exception is not integrated into a practical application. The claims recite the additional elements of [a] “the first weight is configured to indicate a learning ability of the generator, the second weight is configured to indicate a learning ability of the discriminator,” and [b] generic computer components of a memory, processor, and non-transitory medium in claims 10 and 19. Element [a] is a mere instruction to apply the exception because it recites an outcome (that the first and second weights indicate learning abilities) without explaining how the outcome is accomplished (it is unclear how the weights are supposed to indicate learning abilities, and it is not entirely clear what the applicant intends the term “learning ability” to mean). Element [b] is also a mere instruction to apply the exception because it adds generic computer components after the fact to the abstract idea. Even when considered together, the additional elements do not integrate the abstract idea into a practical application because they only add mere instructions to apply the exception to the mathematical calculation.
(2B) The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As explained above, additional elements [a] and [b] are mere instructions to apply the exception. Even considering the claim as a whole with both the additional elements and the mathematical calculation, the additional elements do not amount to significantly more than the abstract idea itself because they are mere instructions to apply the mathematical calculation.
Claims 2 and 11 explicitly recite mathematical equations, and therefore add to the abstract idea.
Claims 3 and 12 recite the models are CNN, RNN, or DNN. (2A, prong 2) This additional element does not integrate the abstract idea into a practical application because it is a mere instruction to apply the exception. This additional element only recites an outcome (that the models are CNN, RNN, or DNN) without explaining how these models function or how they work with the rest of the claim, and the element only invokes a computer to perform an existing process by generally stating that the abstract idea is to be performed by existing machine learning models of a CNN, RNN, or DNN.
Claims 4 and 13 recite the weights are initialized through Xavier, Kaiming, Fixup, or LSUV initialization or transfer learning. These claims add to the mathematical calculation because performing weight initialization by any one of Xavier, Kaiming, Fixup, and layer-sequential unit-variance (LSUV) initialization requires performing a number of mathematical calculations in order to obtain the initial weights.
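For example, Xavier (Glorot) uniform initialization computes a bound from the layer's fan-in and fan-out and then samples uniformly within it; a minimal sketch (the layer sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform initialization: draw weights from U(-a, a)
    with a = sqrt(6 / (fan_in + fan_out)), a bound chosen so activation
    variance is roughly preserved across layers."""
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))

# Illustrative 256 -> 128 layer.
W = xavier_uniform(256, 128, rng)
print(W.shape)  # (256, 128); every entry lies within the computed bound
```

Computing the bound and drawing each of the fan_in × fan_out samples are the mathematical calculations the rejection refers to.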
Claims 5-9 and 14-18 recite the training comprises iteratively training and updating the weights, updating according to a learning rate and the loss function; setting the learning ratio dynamically according to training time, and particular equations for the loss functions. These claims all add either additional steps to calculations performed in the parent claims (for claims 5-7 and 14-16) or explicitly recite a mathematical formula (for claims 8-9 and 17-18).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1, 3-10, 12-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gottlieb (US 11,341,699 B1) in view of Ali et al., Improving Training of Generative Adversarial Networks (see NPL attached to rejection mailed 4/10/2025).
In reference to claim 1, Gottlieb discloses a method for optimizing generative adversarial network (GAN) (method for training a GAN, col. 3, lines 11-17) comprising: determining a first weight of a generator and a second weight of a discriminator (weights of the generator and discriminator are initialized with Xavier initialization, col. 12, lines 4-26), wherein the first weight is equal to the second weight, the first weight is configured to indicate a learning ability of the generator, the second weight is configured to indicate a learning ability of the discriminator (the Examiner notes that it is not entirely clear what these limitations are supposed to mean. For a first weight being equal to a second weight, it is not clear from the specification what specific weights are being referred to, and the specification and dependent claim 4 explicitly state using Xavier, Kaiming, etc. initializations for weights; these initializations take random samples from different distributions, and thus a first and second weight could randomly be equal. For the “learning ability,” this is not a normal term in the art, and this term is not defined by the specification. The weights in Gottlieb would indicate a learning ability to the extent that the loss calculated using the weights can indicate when a minimum of the loss function has been reached, i.e., how much more training is needed, col. 12, lines 4-26); and alternative iteratively training the generator and the discriminator until the generator and the discriminator are convergent (see fig. 5 and cols. 10-13, which describe the training process: the discriminator and then the generator are trained alternately until the validation error is no longer decreasing) … wherein the first weight and the second weight are in positive correlation (see note above about the weights; the weights are correlated in the sense that they will both update together to converge, col. 12, lines 4-26); the training of the generator is related to a loss function of the generator (both the generator's and the discriminator's loss functions are related through the common term D(G(z)), which represents the probability of synthetic data being real, col. 11, lines 25-45), a target of the generator is maximizing the loss function of the generator to match generated sample distribution to real sample distribution (the generator can maximize its loss function; col. 11, lines 43-45); the training of the discriminator is related to a loss function of the generator (both loss functions are related through the common term D(G(z)), col. 11, lines 25-45), a target of the discriminator is minimizing the loss function of the discriminator to determine whether an input sample is a real image or an image generated by the generator (the discriminator minimizes its loss, col. 13, lines 12-14).
However, Gottlieb does not explicitly teach a Nash equilibrium between the generator and the discriminator and a determination of probability of the discriminator is 0.5.
Ali teaches a Nash equilibrium between the generator and the discriminator and a determination of probability of the discriminator is 0.5 (page 82, first paragraph – generator and discriminator reach Nash equilibrium so that the generator can deceive the discriminator).
It would have been obvious to one of ordinary skill in the art, having the teachings of Gottlieb and Ali before the earliest effective filing date, to modify the training as disclosed by Gottlieb to include the Nash equilibrium of Ali.
One of ordinary skill in the art would have been motivated to modify the training of Gottlieb to include the Nash equilibrium of Ali because the Nash equilibrium of Ali is the standard way of determining that training has completed for a GAN (Ali, page 82, first column).
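The 0.5 figure follows from the standard GAN analysis: the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_gen(x)), which equals 0.5 everywhere once the generated distribution matches the data distribution. A numeric sketch (illustrative only; the Gaussian densities and grid are assumptions, not taken from the record):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-5, 5, 1001)
p_data = gaussian_pdf(x, 0.0, 1.0)

# Before convergence the generator's distribution differs from the data,
# so the optimal discriminator is confident away from the overlap region.
p_gen_early = gaussian_pdf(x, 2.0, 1.0)
d_early = p_data / (p_data + p_gen_early)

# At the Nash equilibrium the generator matches the data distribution,
# and the optimal discriminator outputs exactly 0.5 everywhere.
p_gen_converged = gaussian_pdf(x, 0.0, 1.0)
d_converged = p_data / (p_data + p_gen_converged)

print(d_converged.min(), d_converged.max())  # both 0.5
```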
In reference to claim 3, Gottlieb discloses the method according to claim 2, wherein the generator and the discriminator are both neural networks, the neural network includes at least one of convolutional neural networks (CNN), recurrent neural network (RNN) and deep neural networks (DNN) (DNN, col. 11, line 11).
In reference to claim 4, Gottlieb discloses the method according to claim 3, wherein the determining a first weight of a generator and a second weight of a discriminator by at least one of Xavier initialization, Kaiming initialization, Fixup initialization, LSUV initialization, and transfer learning (Xavier initialization, col. 12, lines 4-26).
In reference to claim 5, Gottlieb discloses the method according to claim 3, wherein the alternative iteratively training the generator and the discriminator further comprises: training the generator and updating the first weight; and training the discriminator and updating the second weight (see fig. 5 and cols. 10-13 which describe the training process: discriminator and generator are trained and first weights in generator and second weights in discriminator are updated according to loss).
In reference to claim 6, Gottlieb discloses the method according to claim 5, wherein the updating of the first weight is related to a learning ratio and a loss function of the generator, the updating of the second weight is related to a learning ratio and a loss function of the discriminator; the learning ratio is an upgrade range of corresponding weight (Gottlieb teaches adjusting the first and second weights based on a loss function and a learning rate, which controls how far the weight is updated, col. 12, lines 4-49).
In reference to claim 7, Gottlieb discloses the method according to claim 6, wherein the learning ratio is dynamically set according to training times (learning rate changes dynamically during training to balance training time and accuracy, col. 12, lines 35-49).
In reference to claim 8, Gottlieb does not explicitly teach the method according to claim 6, wherein the loss function of the generator is
$L_g = -\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)$
wherein $m$ means a quantity of the noise sample $z$; $z^{(i)}$ means an $i$th noise sample; $D\left(G\left(z^{(i)}\right)\right)$ means a probability of determining the image being true; $\theta_g$ means the first weight.
Ali teaches the method according to claim 6, wherein the loss function of the generator is
$L_g = -\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)$
wherein $m$ means a quantity of the noise sample $z$; $z^{(i)}$ means an $i$th noise sample; $D\left(G\left(z^{(i)}\right)\right)$ means a probability of determining the image being true; $\theta_g$ means the first weight (see equation for generator in the first column, page 82).
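The averaged quantity inside this formula can be computed directly; a numeric sketch with toy stand-ins for D and G (illustrative assumptions, not the networks of Gottlieb or Ali):

```python
import numpy as np

rng = np.random.default_rng(0)

m = 4
z = rng.normal(size=m)                    # m noise samples z^(i)
G = lambda z: 0.8 * z                     # toy generator
D = lambda v: 1.0 / (1.0 + np.exp(-v))   # toy discriminator, output in (0, 1)

# (1/m) * sum_i log(1 - D(G(z^(i)))): the term the generator's
# gradient in the claimed formula is taken over.
loss_term = np.mean(np.log(1.0 - D(G(z))))
print(loss_term < 0)  # log of a probability-complement is negative
```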
It would have been obvious to one of ordinary skill in the art, having the teachings of Gottlieb and Ali before the earliest effective filing date, to modify the loss function as disclosed by Gottlieb to include the loss function of Ali.
One of ordinary skill in the art would have been motivated to modify the loss function of Gottlieb to include the loss function of Ali because the loss function of Ali is the one used to train standard GANs (Ali, page 82, first column).
In reference to claim 9, Gottlieb does not explicitly teach the method according to claim 8, wherein the loss function of the discriminator is
$L_d = \nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[\log D\left(x^{(i)}\right) + \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)\right]$
wherein $x^{(i)}$ means an $i$th real image; $D\left(x^{(i)}\right)$ means a probability of determining the real image $x^{(i)}$ being true; $\theta_d$ means the second weight.
Ali teaches the method according to claim 8, wherein the loss function of the discriminator is
$L_d = \nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[\log D\left(x^{(i)}\right) + \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)\right]$
wherein $x^{(i)}$ means an $i$th real image; $D\left(x^{(i)}\right)$ means a probability of determining the real image $x^{(i)}$ being true; $\theta_d$ means the second weight (see equation for discriminator in the first column, page 82).
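As with the generator formula, the averaged bracketed quantity can be computed numerically; a sketch with toy stand-ins for D and G (illustrative assumptions, not the networks of Gottlieb or Ali):

```python
import numpy as np

rng = np.random.default_rng(1)

m = 4
x = rng.normal(loc=2.0, size=m)           # m real images x^(i) (as scalars)
z = rng.normal(size=m)                    # m noise samples z^(i)
G = lambda z: 0.8 * z                     # toy generator
D = lambda v: 1.0 / (1.0 + np.exp(-v))   # toy discriminator, output in (0, 1)

# (1/m) * sum_i [log D(x^(i)) + log(1 - D(G(z^(i))))]: the bracketed
# term the discriminator's gradient in the claimed formula is taken over.
loss_term = np.mean(np.log(D(x)) + np.log(1.0 - D(G(z))))
print(loss_term < 0)  # both logs are of values in (0, 1)
```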
It would have been obvious to one of ordinary skill in the art, having the teachings of Gottlieb and Ali before the earliest effective filing date, to modify the loss function as disclosed by Gottlieb to include the loss function of Ali.
One of ordinary skill in the art would have been motivated to modify the loss function of Gottlieb to include the loss function of Ali because the loss function of Ali is the one used to train standard GANs (Ali, page 82, first column).
In reference to claim 10, this claim is directed to an apparatus associated with the method claimed in claim 1 and is therefore rejected under a similar rationale.
In reference to claim 12, this claim is directed to an apparatus associated with the method claimed in claim 3 and is therefore rejected under a similar rationale.
In reference to claim 13, this claim is directed to an apparatus associated with the method claimed in claim 4 and is therefore rejected under a similar rationale.
In reference to claim 14, this claim is directed to an apparatus associated with the method claimed in claim 5 and is therefore rejected under a similar rationale.
In reference to claim 15, this claim is directed to an apparatus associated with the method claimed in claim 6 and is therefore rejected under a similar rationale.
In reference to claim 16, this claim is directed to an apparatus associated with the method claimed in claim 7 and is therefore rejected under a similar rationale.
In reference to claim 17, this claim is directed to an apparatus associated with the method claimed in claim 8 and is therefore rejected under a similar rationale.
In reference to claim 18, this claim is directed to an apparatus associated with the method claimed in claim 9 and is therefore rejected under a similar rationale.
In reference to claim 19, this claim is directed to a non-transitory computer-readable medium associated with the method claimed in claim 1 and is therefore rejected under a similar rationale.
Claim(s) 2 and 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gottlieb (US 11,341,699 B1) in view of Ali et al., Improving Training of Generative Adversarial Networks (see NPL attached to rejection mailed 4/10/2025) as applied to claims 1 and 10 above, and in further view of Voinea et al. (US 11,797,705 B1).
In reference to claim 2, Gottlieb and Ali do not explicitly teach the method according to claim 1, wherein a formula of an output of the neural network is:
$y = f_3\left(W_3 \cdot f_2\left(W_2 \cdot f_1\left(W_1 \cdot x\right)\right)\right)$
wherein $y$ means the output of the neural network, $x$ means data sample, $f_1\left(z_1\right)$, $f_2$