DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
Drawings
The applicant’s submitted drawings appear to be acceptable for examination purposes. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the drawings.
Information Disclosure Statement
As required by M.P.E.P. 609(c), the applicant's submission of the Information Disclosure Statement, dated 5 April 2024, is acknowledged by the examiner and the cited references have been considered in the examination of the claims now pending. As required by M.P.E.P. 609(c)(2), a copy of the PTOL-1449 initialed and dated by the examiner is attached to the instant Office action.
Claim Objections
Claim 3 is objected to because of the following informalities: “a different in values” appears as though it should be “a difference in values” or similar. Appropriate correction is required.
Claim 6 depends upon claim 3, and thus includes the aforementioned limitation(s). Claim 6 is also objected to because of the following informalities: “wherein the stop condition comprising” appears as though it should be “wherein the stop condition comprises” or similar. Appropriate correction is required.
Claim 11 is objected to because of the following informalities: “for model used in the Langevin flow” appears as though it should be “for the model used in the Langevin flow” or similar. Appropriate correction is required.
Claim 12 is objected to because of the following informalities: “determining a different or differences” appears as though it should be “determining a difference or differences” or similar. Appropriate correction is required.
Claim 18 is objected to because of the following informalities: “for model used in the Langevin flow” appears as though it should be “for the model used in the Langevin flow” or similar. Appropriate correction is required.
Claim 19 is objected to because of the following informalities: “determining a different or differences” appears as though it should be “determining a difference or differences” or similar. Appropriate correction is required.
A series of singular dependent claims is permissible in which a dependent claim refers to a preceding claim which, in turn, refers to another preceding claim.
A claim which depends from a dependent claim should not be separated by any claim which does not also depend from said dependent claim (see, e.g., claim 6). It should be kept in mind that a dependent claim may refer to any preceding independent claim. In general, applicant's sequence will not be changed. See MPEP § 608.01(n).
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-7 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
As per claim 1, the intended scope of the claim is not clear because it is not clear what is meant by a “[normal] distribution.” For the purposes of examination, the examiner has assumed that this is a normal distribution, but the intended purpose/meaning of the brackets is not clear.
Claims 2-7 depend upon claim 1, and thus include the aforementioned limitation(s).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 and 3-20 are rejected under 35 U.S.C. 103 as being unpatentable over Nijkamp et al. (MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC, March 2022, pgs. 1-18) in view of Papamakarios et al. (Normalizing Flows for Probabilistic Modeling and Inference, April 2021, pgs. 1-64).
As per claim 1, Nijkamp teaches a computer-implemented method comprising: initializing flow parameters for a flow neural network and energy-based model parameters for an energy-based model (EBM) [learning an energy-based model (EBM) with a flow-based model serving as a backbone, so that the EBM is a correction or an exponential tilting of the flow-based model; where the mapping in the flow-based model is deterministic and one-one, with closed-form inversion and Jacobian that can be efficiently computed, which leads to an explicit normalized density via change of variable; using a relatively simple energy function parameterized by a free-form ConvNet (pgs. 1-2, abstract and section 1; etc.); which can use a neural network or pretrained general latent variable model as the flow-based model (pg. 5, section 3.4; pg. 6, sections 3.5-3.6; etc.); and where the EBM network parameters are initialized with Xavier (pg. 16, A.4; etc.); where pretraining the flow-based neural network model is initializing the flow parameters for the flow-based neural network, and initializing the EBM network parameters is initializing the energy-based model parameters for the EBM]; and performing a set of steps until a stop condition is reached [training is performed for T learning iterations, with K MCMC steps in each learning iteration (pg. 5, Algorithm 1; etc.); where the set number of iterations/steps is a stop condition being reached], the set of steps comprising: for each training signal sampled from an unknown data distribution: generating an initial signal sampled from a [normal] distribution [For an unnormalized target distribution, the neural transport sampler trains a flow-based model as a variational approximation to the target distribution and then samples the target distribution in the space of latent variables of the flow-based model via change of variable. In the latent space, the target distribution is close to the prior distribution of the latent variables of the flow-based model, which is usually a unimodal Gaussian white noise distribution (pg. 3, section 2; etc.); where the Gaussian distribution is a [normal] distribution]; transforming the initial signal using a flow neural network to obtain a flow-generated signal [For an unnormalized target distribution, the neural transport sampler trains a flow-based model as a variational approximation to the target distribution and then samples the target distribution in the space of latent variables of the flow-based model via change of variable. In the latent space, the target distribution is close to the prior distribution of the latent variables of the flow-based model, which is usually a unimodal Gaussian white noise distribution (pg. 3, section 2; etc.); where the flow-based model generates the flow-generated signal via change of variable]; and generating, via a Markov chain Monte Carlo (MCMC) sampling process, a synthesized signal using the EBM [The EBM can generate synthesized examples from pθ, using gradient-based MCMC sampling, such as Langevin dynamics (pgs. 4-5, section 3.2; etc.)] and using the flow-generated signal as an initial starting point for the MCMC sampling process [Instead of using uniform or Gaussian white noise distribution for the reference distribution (starting point) in the EBM, we can use the relatively simple flow-based model as the reference model (pg. 5, section 3.3; etc.); where the flow-based model is producing the flow-generated signal (see above)]; using a set of synthesized signals and a set of flow-generated signals corresponding to the set of synthesized signals to update the flow parameters for the normalizing flow neural network; and updating the energy-based model parameters for the energy-based model by using a comparison comprising the set of synthesized signals and a set of training signals corresponding to the set of synthesized signals [the parameters of the models are updated over T learning iterations according to equation (6), which includes a difference (comparison) between the set of synthesized examples xi- and a set of training examples xi (pg. 4, section 3.2; pg. 5, section 3.4 and Algorithm 1; etc.), where equation (6) shows how the comparison is made and Algorithm 1 describes updating the (flow and EBM) parameters of the models using this equation/comparison].
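For clarity of the record, the mapped pattern may be sketched as follows. This is a minimal illustration only (the function and variable names, signatures, and step sizes are the examiner's assumptions, not Nijkamp's actual code): an initial signal drawn from a normal distribution is transformed by a flow network, and the flow output seeds a finite-step Langevin MCMC refinement under the EBM.

```python
# Illustrative sketch only; all names are hypothetical, not from the reference.
import torch

def langevin_refine(energy, x, k_steps=20, step_size=0.01):
    """Refine a flow-generated sample x with K Langevin steps under the EBM."""
    x = x.detach().clone().requires_grad_(True)
    for _ in range(k_steps):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        # Langevin update: x <- x - (s^2 / 2) * dE/dx + s * noise
        x = x - 0.5 * step_size ** 2 * grad + step_size * torch.randn_like(x)
        x = x.detach().requires_grad_(True)
    return x.detach()

# Usage pattern mapped to the claim (flow and energy are assumed callables):
# z0 = torch.randn(batch, dim)             # initial signal, normal distribution
# x_flow = flow(z0)                        # flow-generated signal
# x_syn = langevin_refine(energy, x_flow)  # synthesized signal via MCMC
```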
While Nijkamp teaches updating flow parameters of a flow-based neural network to generate flow-based signals (see above), it has not been relied upon for teaching updating a normalizing flow neural network.
Papamakarios teaches updating normalizing flow parameters for a normalizing flow neural network and transforming the initial signal using a normalizing flow neural network to obtain a normalized flow-generated signal [Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations (pg. 1, abstract; etc.) with a flow-based neural network (pg. 3, section 2.1; etc.) and training the flow parameters of the generator neural network using adversarial training or other approaches (pg. 9, section 2.3; etc.); which is thus a normalizing flow neural network, for the flow-based neural network of Nijkamp, above].
Nijkamp and Papamakarios are analogous art, as they are within the same field of endeavor, namely training models including flow-based neural network models.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize a normalizing flow neural network with normalizing flow parameters/signals, as taught by Papamakarios, as the flow-based neural network and flow parameters/signals in the training of the flow-based neural network and EBM in the system taught by Nijkamp.
Papamakarios provides motivation as [Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations, and providing expressive power and computational trade-offs (pg. 1, abstract; etc.)].
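For clarity of the record, the change-of-variable density mechanism that Papamakarios describes can be sketched as follows. This is a hedged illustration with hypothetical names; the reference defines the general mechanism, not this code.

```python
# Hedged illustration of the change-of-variable density (hypothetical names).
import torch

def flow_log_prob(z, log_det_jacobian):
    """log p(x) = log p_base(z) + log |det(dz/dx)| for an invertible flow,
    where z is the base-space preimage of the data point x."""
    base = torch.distributions.Normal(0.0, 1.0)
    log_p_z = base.log_prob(z).sum(dim=-1)  # factorized standard-normal base
    return log_p_z + log_det_jacobian
```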
As per claim 3, Nijkamp/Papamakarios teaches wherein the step of updating the energy-based model parameters for the energy-based model by using a comparison comparing the set of synthesized signals and a set of training signals corresponding to the set of synthesized signals comprises: determining a learning gradient comprising a different in values obtained using values from the EBM given the set of training signals as inputs and values from the EBM given the set of synthesized signals as inputs to the EBM [the parameters of the models are updated over T learning iterations according to equation (6), which includes a difference between outputs, using the set of synthesized examples xi- and a set of training examples xi as inputs (Nijkamp: pg. 4, section 3.2; pg. 5, section 3.4 and Algorithm 1; etc.), where equation (6) shows how the difference calculation is made and Algorithm 1 describes updating the (flow and EBM) parameters of the models using this equation/comparison].
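For clarity, the difference-in-values mapping above can be illustrated by the following sketch. The sign convention and names are the examiner's assumptions for illustration; equation (6) of Nijkamp is only paraphrased here, not reproduced.

```python
# Paraphrase of an eq. (6)-style contrastive update (assumed sign convention).
import torch

def ebm_loss(energy, x_train, x_syn):
    """Objective whose gradient is the difference between the EBM's values on
    training inputs and on synthesized inputs; minimizing it lowers energy on
    data and raises it on synthesized samples."""
    return energy(x_train).mean() - energy(x_syn).mean()

# loss = ebm_loss(energy, x_train, x_syn); loss.backward()  # updates theta
```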
As per claim 4, Nijkamp/Papamakarios teaches wherein: the set of training signals represents a set of training images; and the set of synthesized signals represents a set of synthesized images [the inputs to the models can be images (Nijkamp: pg. 1, section 1; pg. 3, section 2; pg. 4, section 3.1; etc.); including the training examples and synthesized examples (Nijkamp: pgs. 4-5, sections 3.2 and 3.4; etc.)].
As per claim 5, Nijkamp/Papamakarios teaches, responsive to a stop condition being reached, outputting a final version of the normalizing flow parameters for the normalizing flow neural network and a final version of the energy-based model parameters for the energy-based model [training is performed for T learning iterations, with K MCMC steps in each learning iteration (Nijkamp: pg. 5, Algorithm 1; etc.); where the set number of iterations/steps is a stop condition being reached, which produces the final versions].
As per claim 6, Nijkamp/Papamakarios teaches wherein the stop condition comprising an iteration number having been met, a processing time having been met, an amount of data processing having been met, a number of processing iterations having been met, or a convergence condition or conditions having been met [training is performed for T learning iterations, with K MCMC steps in each learning iteration (Nijkamp: pg. 5, Algorithm 1; etc.); where the set number of iterations/steps is an iteration number having been met].
As per claim 7, Nijkamp/Papamakarios teaches wherein the MCMC sampling process is an iterative process with a finite number of Langevin steps of a Langevin flow [training is performed for T learning iterations, with K MCMC steps in each learning iteration (Nijkamp: pg. 5, Algorithm 1; etc.) using Langevin dynamics for the MCMC sampling (Nijkamp: pg. 5, section 3.3; etc.); where the K steps is a finite number of Langevin steps of a Langevin flow (dynamics)].
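For reference, a finite-K Langevin flow of the kind mapped above is conventionally written as follows (a standard formulation, not quoted from Nijkamp):

```latex
x_{k+1} = x_k - \frac{s^2}{2}\,\nabla_x E_\theta(x_k) + s\,\epsilon_k,
\qquad \epsilon_k \sim \mathcal{N}(0, I), \quad k = 0, \dots, K-1
```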
As per claim 8, Nijkamp teaches a computer-implemented method comprising: generating a set of initial signals, which are sampled from a distribution [For an unnormalized target distribution, the neural transport sampler trains a flow-based model as a variational approximation to the target distribution and then samples the target distribution (initial signals) in the space of latent variables of the flow-based model via change of variable. In the latent space, the target distribution is close to the prior distribution of the latent variables of the flow-based model, which is usually a unimodal Gaussian white noise distribution (pg. 3, section 2; etc.)]; transforming the initial signals by flow using a flow neural network comprising flow parameters to obtain a set of flow-generated signals corresponding to the set of initial signals [For an unnormalized target distribution, the neural transport sampler trains a flow-based model as a variational approximation to the target distribution and then samples the target distribution in the space of latent variables of the flow-based model via change of variable. In the latent space, the target distribution is close to the prior distribution of the latent variables of the flow-based model, which is usually a unimodal Gaussian white noise distribution (pg. 3, section 2; etc.); which can use a neural network or pretrained general latent variable model as the flow-based model (pg. 5, section 3.4; pg. 6, sections 3.5-3.6; etc.); where the flow-based model generates the flow-generated signal via the change of variable]; for each flow-generated signal of the set of flow-generated signals, generating a synthesized signal by performing a Langevin flow that is initialized with the flow-generated signal [the EBM can generate synthesized examples from pθ, using gradient-based MCMC sampling, such as Langevin dynamics (pgs. 4-5, section 3.2; etc.) and instead of using uniform or Gaussian white noise distribution for the reference distribution (starting point) in the EBM, we can use the relatively simple flow-based model as the reference model (pg. 5, section 3.3; etc.); where the EBM using Langevin dynamics (flow) is the performing a Langevin flow (initialized with the flow-generated signal from the flow-based model)]; updating of the flow parameters of the flow neural network by treating the synthesized signals generated by the Langevin flow as training data; and updating of the Langevin flow according to a learning gradient of a model used in the Langevin flow using the synthesized signals generated and a set of observed signals [the parameters of the models are updated over T learning iterations according to equation (6), which includes a difference (comparison) between the set of synthesized examples xi- and a set of training examples xi (pg. 4, section 3.2; pg. 5, section 3.4 and Algorithm 1; etc.), where equation (6) shows how the comparison is made and Algorithm 1 describes updating the (flow and EBM) parameters of the models using this equation/comparison].
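For clarity, the limitation of treating the synthesized signals as training data for the flow can be illustrated as follows. This sketch uses hypothetical names and assumes the flow object exposes a log_prob method; it is an assumption for illustration, not a teaching of the references.

```python
# Illustrative only; assumes the flow object exposes a log_prob method.
def flow_update(flow, optimizer, x_syn):
    """One maximum-likelihood step treating MCMC output x_syn as training data."""
    optimizer.zero_grad()
    nll = -flow.log_prob(x_syn).mean()  # negative log-likelihood of x_syn
    nll.backward()
    optimizer.step()
    return float(nll)
```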
While Nijkamp teaches updating flow parameters of a flow-based neural network to generate flow-based signals (see above), it has not been relied upon for teaching updating a normalizing flow neural network.
Papamakarios teaches updating normalizing flow parameters for a normalizing flow neural network and transforming the initial signal using a normalizing flow neural network to obtain a normalized flow-generated signal [Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations (pg. 1, abstract; etc.) with a flow-based neural network (pg. 3, section 2.1; etc.) and training the flow parameters of the generator neural network using adversarial training or other approaches (pg. 9, section 2.3; etc.); which is thus a normalizing flow neural network, for the flow-based neural network of Nijkamp, above].
Nijkamp and Papamakarios are analogous art, as they are within the same field of endeavor, namely training models including flow-based neural network models.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize a normalizing flow neural network with normalizing flow parameters/signals, as taught by Papamakarios, as the flow-based neural network and flow parameters/signals in the training of the flow-based neural network and EBM in the system taught by Nijkamp.
Papamakarios provides motivation as [Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations, and providing expressive power and computational trade-offs (pg. 1, abstract; etc.)].
As per claim 9, Nijkamp/Papamakarios teaches wherein the model for the Langevin flow is an energy-based model [the EBM can generate synthesized examples from pθ, using gradient-based MCMC sampling, such as Langevin dynamics (Nijkamp: pgs. 4-5, section 3.2; etc.)].
As per claim 10, Nijkamp/Papamakarios teaches wherein the steps of Claim 8 represent an iteration and the method further comprises: repeating the steps of Claim 8 for a set of iterations until a stop condition is reached [training is performed for T learning iterations, with K MCMC steps in each learning iteration (Nijkamp: pg. 5, Algorithm 1; etc.)].
As per claim 11, Nijkamp/Papamakarios teaches, responsive to a stop condition being reached, outputting a final version of the normalizing flow parameters for the normalizing flow neural network and a final version of parameters for model used in the Langevin flow [training is performed for T learning iterations, with K MCMC steps in each learning iteration (Nijkamp: pg. 5, Algorithm 1; etc.); where the set number of iterations/steps is a stop condition being reached, which produces the final versions].
As per claim 12, Nijkamp/Papamakarios teaches wherein the learning gradient of the model used in the Langevin flow is obtained by performing steps comprising: determining a different or differences in values obtained using values from the model given the set of training signals as inputs and values from the model given the set of synthesized signals as inputs to the model [the parameters of the models are updated over T learning iterations according to equation (6), which includes a difference between outputs, using the set of synthesized examples xi- and a set of training examples xi as inputs (Nijkamp: pg. 4, section 3.2; pg. 5, section 3.4 and Algorithm 1; etc.), where equation (6) shows how the difference calculation is made and Algorithm 1 describes updating the (flow and EBM) parameters of the models using this equation/comparison].
As per claim 13, see the rejection of claim 4, above.
As per claim 14, Nijkamp/Papamakarios teaches wherein a normalizing flow neural network is pretrained [using a relatively simple energy function parameterized by a free-form ConvNet (Nijkamp: pgs. 1-2, abstract and section 1; etc.); which can use a neural network or pretrained general latent variable model as the flow-based model (Nijkamp: pg. 5, section 3.4; pg. 6, sections 3.5-3.6; etc.)].
As per claim 15, see the rejection of claim 8, above, wherein Nijkamp/Papamakarios also teaches a system comprising: one or more processors; and a non-transitory computer-readable medium or media comprising one or more sets of instructions which, when executed by at least one of the one or more processors, causes steps to be performed comprising: [the method] [experiments are performed using the models on multiple image sets (pg. 7, figs. 3-4, pg. 8, fig. 6; etc.); which requires instructions executed, from memory, by one or more processors (as otherwise images could not be generated)].
As per claim 16, see the rejection of claim 9, above.
As per claim 17, see the rejection of claim 10, above.
As per claim 18, see the rejection of claim 11, above.
As per claim 19, see the rejection of claim 12, above.
As per claim 20, see the rejection of claim 13, above.
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Nijkamp and Papamakarios as applied to claim 1 above, and further in view of Evans (US 11,113,632).
As per claim 2, Nijkamp/Papamakarios teaches the computer-implemented method of claim 1, as described above.
While Nijkamp/Papamakarios teaches updating the normalizing flow parameters for the normalizing flow neural network over a number of iterations (see above), it has not been relied upon for teaching wherein updating the normalizing flow parameters for the normalizing flow neural network is performed via gradient ascent.
Evans teaches wherein updating the normalizing flow parameters for the normalizing flow neural network is performed via gradient ascent [gradient-based optimization techniques, including stochastic gradient ascent, can be used to optimize the model parameters during training iterations until a predetermined termination condition is reached (col. 6, line 54 to col. 7, line 31; col. 7, lines 48-54; etc.), including for neural networks for normalizing flows (col. 5, lines 47-53; etc.)].
Nijkamp/Papamakarios and Evans are analogous art, as they are within the same field of endeavor, namely training learning models including neural networks and normalizing flows.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use gradient ascent to update the parameters of the neural network in an iterative training process until a termination condition, as taught by Evans, for the normalizing flow parameters of the normalizing flow neural network trained over a set number of training iterations in Nijkamp/Papamakarios.
Evans provides motivation as [These computational strategies enable the use of gradient-based optimization techniques (such as stochastic gradient ascent) to be employed to maximize the ELBO with respect to parameters (such as variational parameters), and thus perform variational inference. (col. 6, line 54 to col. 7, line 31; etc.); for the variational inference of the normalizing flow neural network of Nijkamp/Papamakarios (see, e.g., Nijkamp: pg. 5, section 3.4; etc.)].
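For clarity, stochastic gradient ascent of the kind attributed to Evans can be sketched as follows (illustrative only; hypothetical names, not code from the reference):

```python
# Illustrative stochastic gradient ascent step (hypothetical names).
def gradient_ascent_step(params, grads, lr=1e-3):
    """theta <- theta + lr * dL/dtheta; ascent maximizes the objective
    (e.g., an ELBO), repeated until a termination condition is reached."""
    return [p + lr * g for p, g in zip(params, grads)]
```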
Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 1-20 are rejected.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Vahdat (US 12,249,048) – discloses score based generative modeling in a latent space including a normalizing flow decoder network.
Xie et al. (Cooperative Learning of Energy-Based Model and Latent Variable Model via MCMC Teaching, 2018, pgs. 4292-4301 – cited in an IDS) – discloses training an EBM (with MCMC) and latent variable model together.
Nijkamp et al. (Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model, 2019, pgs. 1-11 – cited in an IDS) – discloses learning an EBM using a fixed number of MCMC steps.
Du et al. (Improved Contrastive Divergence Training of Energy-Based Model, June 2021, pgs. 1-16 – cited in an IDS) – discloses training an EBM using contrastive divergence.
Rezende et al. (Variational Inference with Normalizing Flows, 2015, pgs. 1-9) – discloses using normalizing flow approximation for variational inference.
The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this Office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas can be reached at 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GEORGE GIROUX/Primary Examiner, Art Unit 2128