Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 30 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding Claim 30, it recites “the loss function” (line 2). This term lacks antecedent basis because there has been no previous mention of a loss function with respect to the noise schedule. In addition, line 3 recites “gradients of the other parameters of the diffusion model.” It is unclear which of the “one or more learned parameter values” of claim 21 are considered “the other parameters.” Also, there has been no previous indication that any parameters have gradients.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4 and 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Song, Yang, et al. (“Score-based generative modeling through stochastic differential equations,” arXiv preprint arXiv:2011.13456 (2020); hereinafter “Song”) in view of Ho, Jonathan, Ajay Jain, and Pieter Abbeel (“Denoising diffusion probabilistic models,” Advances in neural information processing systems 33 (2020): 6840-6851; hereinafter “Ho”).
Regarding Claim 1, Song teaches a computing system that leverages Fourier features for improved fine scale prediction (sections 1 and H.2, second-fourth paragraphs), comprising:
one or more processors (section 1—the model and calculations are clearly run on one or more processors); and
one or more non-transitory, computer-readable media (computers inherently store instructions and data on computer-readable media) that collectively store:
at least a denoising model of a machine-learned diffusion model (section 1 and fig. 1—the score-based generative model is a denoising model of a machine-learned diffusion model), the diffusion model comprising:
a noising model comprising a plurality of noising stages, the noising model configured to receive input data and produce latent data in response to receipt of the input data (figs. 1 and 2; sections 1 and 3.1—the SDE adds noise in a plurality of stages to produce latent data); and
the denoising model configured to reconstruct output data from the latent data (figs. 1 and 2; sections 1 and 3.2—reversing the SDE reconstructs output from the latent data);
wherein input to the denoising model comprises a set of Fourier features comprising a linear projection of channels of at least one stage of the plurality of noising stages (section H.2—Fourier feature embeddings are used as inputs to the model. The scale parameter of 16 indicates a linear projection); and
instructions that, when executed by the one or more processors, cause the computing system to execute the denoising model to process the latent data to generate the output data (section 3.2—obtaining samples from the reverse process generates the output data by executing the denoising model to process the latent data. See also fig. 12 and section H.3).
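For illustration only, the relationship between a scale parameter and a linear projection can be sketched as follows. This is a minimal sketch, assuming a random Gaussian Fourier embedding of a scalar input; the function name, the number of frequencies, and the seed are hypothetical, and only the scale value of 16 mirrors the parameter cited above. It is not reproduced from Song or from the application.

```python
import math
import random

def gaussian_fourier_features(x, num_frequencies=4, scale=16.0, seed=0):
    """Embed scalar x as [sin(2*pi*w_i*x), ..., cos(2*pi*w_i*x), ...].

    The weights w_i are fixed random draws multiplied by `scale`.
    Multiplying x by them is the linear projection; sin and cos are
    applied to the projected values afterwards.
    """
    rng = random.Random(seed)  # fixed seed so the projection is reproducible
    weights = [rng.gauss(0.0, 1.0) * scale for _ in range(num_frequencies)]
    projected = [2.0 * math.pi * w * x for w in weights]  # linear in x
    return [math.sin(p) for p in projected] + [math.cos(p) for p in projected]

# Hypothetical input value; yields 4 sine features followed by 4 cosine features.
features = gaussian_fourier_features(0.25)
```

In this sketch the scale only stretches the projection weights, so a larger scale yields higher-frequency features of the same input.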
Song does not specifically teach wherein the latent data comprises a compressed representation of the input data and the output data comprises a decompressed representation of the input data. However, Ho teaches wherein latent data comprises a compressed representation of input data and output data comprises a decompressed representation of the input data (section 2 describes a diffusion model that generates latent data from image input data. Section 3 further details the diffusion model as including a noising model that generates the latent data in a forward process and a denoising model that generates an image from the latent data in a reverse process. Section 4.3 describes an embodiment in which the diffusion model functions as an excellent lossy compressor: the latent data comprises a compressed representation of the input data, and the reverse process reconstructs an original image by decompressing the latent data to produce output data).
All of the claimed elements were known in Song and Ho and could have been combined by known methods with no change in their respective functions. It therefore would have been obvious to a person of ordinary skill in the art at the time of filing of the applicant’s invention to combine the compressed representation of Ho with the latent data of Song to yield the predictable result that the latent data comprises a compressed representation of the input data and the output data comprises a decompressed representation of the input data. One would be motivated to make this combination for the purpose of ensuring accessibility of the internet to wide audiences as data becomes higher resolution (Ho, section 6).
Regarding Claim 2, Song/Ho teaches wherein the set of Fourier features comprises a linear projection of the channels of each of the plurality of noising stages (Song, sections H.2, H.3, and I.1—the details of the Fourier features input to the model are a matter of design choice, as evidenced by the various examples and experiments of Song. In the present claim, they serve no function, as the model does not perform specific operations or produce specific output using the features).
Regarding Claim 3, Song/Ho teaches wherein the set of Fourier features comprises a linear projection of at least one stage of the plurality of noising stages onto a set of periodic basis functions with high frequency (Song, sections H.2, H.3, and I.1—the details of the Fourier features input to the model are a matter of design choice, as evidenced by the various examples and experiments of Song. In the present claim, they serve no function, as the model does not perform specific operations or produce specific output using the features).
Regarding Claim 4, Song/Ho teaches wherein the set of Fourier features comprises four channels (Song, sections H.2, H.3, and I.1—the details of the Fourier features input to the model are a matter of design choice, as evidenced by the various examples and experiments of Song. In the present claim, they serve no function, as the model does not perform specific operations or produce specific output using the features).
Regarding Claim 7, Song/Ho teaches wherein the input data comprises a bit length, and wherein the set of Fourier features comprises Fourier features having each frequency from one to the bit length (Song, sections H.2, H.3, and I.1—the details of the Fourier features input to the model are a matter of design choice, as evidenced by the various examples and experiments of Song. In the present claim, they serve no function, as the model does not perform specific operations or produce specific output using the features).
Regarding Claim 8, Song/Ho teaches wherein the input data comprises a bit length of eight or greater, and wherein the set of Fourier features comprises Fourier features having each frequency from seven to the bit length (Song, sections H.2, H.3, and I.1—the details of the Fourier features input to the model are a matter of design choice, as evidenced by the various examples and experiments of Song. In the present claim, they serve no function, as the model does not perform specific operations or produce specific output using the features).
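For illustration only, the frequency ranges recited in claims 7 and 8 can be sketched as follows. This is a minimal sketch under a stated assumption: the claims as quoted do not define “frequency,” so the sketch treats each frequency n as an exponent and computes sin(2^n * pi * z) and cos(2^n * pi * z) for a scalar z. The function name, the input value, and that reading of the claim language are hypothetical and are not drawn from Song, Ho, or the application.

```python
import math

def fourier_features(z, n_min, n_max):
    """Return sin(2^n * pi * z) and cos(2^n * pi * z) for n = n_min..n_max."""
    feats = []
    for n in range(n_min, n_max + 1):
        angle = (2 ** n) * math.pi * z
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats

bit_length = 8  # e.g. 8-bit input data
all_freqs = fourier_features(0.3, 1, bit_length)   # frequencies one to the bit length
high_freqs = fourier_features(0.3, 7, bit_length)  # frequencies seven to the bit length
```

Under this reading, the range of claim 8 is simply the high-frequency tail of the range of claim 7.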
Regarding Claim 9, Song/Ho teaches wherein the input data comprises image data (Song, section 1 and figs. 1 and 4).
Allowable Subject Matter
Claims 21-27 and 31-33 are allowed. As described in the first Office Action, Kong, Zhifeng, et al. (“Diffwave: A versatile diffusion model for audio synthesis,” arXiv preprint arXiv:2009.09761 (2020)) teaches most of the limitations of independent claims 21 and 47, but does not teach “wherein the noise schedule is a learned noise schedule that comprises one or more learned parameter values.” Kong uses a predefined, fixed noise schedule. None of the prior art of record teaches a learned noise schedule. Claims 22-27 and 31-33 are allowable by virtue of their dependence on claim 21.
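For illustration only, the distinction between a predefined, fixed noise schedule and a learned noise schedule can be sketched as follows. This is a minimal sketch under stated assumptions: the linear beta range, the log signal-to-noise parameterization, and all names are hypothetical, and none of the values are taken from Kong or from the application.

```python
import math

def fixed_linear_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Predefined schedule: the betas are set once and never updated."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

class LearnedSchedule:
    """Hypothetical schedule whose shape is set by trainable values."""

    def __init__(self, log_snr_max=10.0, log_snr_min=-10.0):
        # These two numbers stand in for "learned parameter values";
        # a training loop (not shown) would update them with the model.
        self.log_snr_max = log_snr_max
        self.log_snr_min = log_snr_min

    def noise_variance(self, t):
        """Noise variance at time t in [0, 1], computed as sigmoid(-log SNR)."""
        log_snr = self.log_snr_max + t * (self.log_snr_min - self.log_snr_max)
        return 1.0 / (1.0 + math.exp(log_snr))

betas = fixed_linear_schedule(50)   # fixed: identical for every training run
schedule = LearnedSchedule()        # learned: changes when its parameters change
```

The fixed schedule has no adjustable state after construction, whereas changing either parameter of the hypothetical learned schedule changes the noise variance it returns.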
Claim 30 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims. Claim 30 contains allowable subject matter by virtue of its dependence on allowed claim 21, but recites some indefinite terms, so it remains rejected under 35 U.S.C. 112(b) as detailed above.
Response to Arguments
The amendments to claims 21-27 and 30-33 are accepted as overcoming the previous rejections under 35 U.S.C. 101.
The amendments to the claims overcome the previous rejections under 35 U.S.C. 112(b) of claims 47-50. The amendments address one of the indefinite terms of claim 30, but several terms of claim 30 still lack antecedent basis, so claim 30 remains rejected as detailed above.
Applicant’s arguments with respect to claims 1-4 and 7-9 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Although Song does not teach the amended limitations of claim 1, Ho (cited in the first Office Action) teaches the amended limitations, as detailed above.
Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAL W SCHNEE whose telephone number is (571) 270-1918. The examiner can normally be reached M-F 7:30 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley, can be reached at 303-297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HAL SCHNEE/
Primary Examiner, Art Unit 2129