DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending in the application.
CLAIM INTERPRETATION
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. The following table lists each limitation reciting “means,” together with its corresponding structure and associated algorithm.
Claim no. | 112(f) element | Corresponding structure | Associated algorithm
17 | means for generating a gradient | FIG. 1 CPU 102, GPU 104, DSP 106, and/or NPU 108; FIG. 11 processor 1110; para. [0008], [0010], [0082] | FIG. 7 “Algorithm 1”; para. [0072]-[0074]
17 | means for combining the gradient | FIG. 1 CPU 102, GPU 104, DSP 106, and/or NPU 108; FIG. 11 processor 1110; para. [0008], [0010], [0082] | FIG. 7 “Algorithm 1”; para. [0072]-[0074]
17 | means for predicting … a new sample | FIG. 1 CPU 102, GPU 104, DSP 106, and/or NPU 108; FIG. 11 processor 1110; para. [0008], [0010], [0082] | FIG. 7 “Algorithm 1”; para. [0072]-[0074]
19 | means for generating a prediction | FIG. 1 CPU 102, GPU 104, DSP 106, and/or NPU 108; FIG. 11 processor 1110; para. [0008], [0010], [0082] | FIG. 7 “Algorithm 1”; para. [0072]-[0074]
20 | means for combining a respective gradient | FIG. 1 CPU 102, GPU 104, DSP 106, and/or NPU 108; FIG. 11 processor 1110; para. [0008], [0010], [0082] | FIG. 7 “Algorithm 1”; para. [0072]-[0074]
20 | means for generating … a respective new sample | FIG. 1 CPU 102, GPU 104, DSP 106, and/or NPU 108; FIG. 11 processor 1110; para. [0008], [0010], [0082] | FIG. 7 “Algorithm 1”; para. [0072]-[0074]
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid their being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid interpretation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 4-5, 8-9, 12-13, 16-17 and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Dhariwal et al. (Dhariwal, P. and Nichol, A., “Diffusion Models Beat GANs on Image Synthesis,” arXiv preprint arXiv:2105.05233, May 11, 2021; hereafter Dhariwal).
As per claim 1, Dhariwal teaches an apparatus (page 17, Table 7, 3rd – 5th columns; page 17, section “A.1 throughput” (see captured image below))
[captured image: media_image1.png]
for providing test-time self-supervised guidance for a diffusion machine learning model, comprising:
at least one memory (See above “GPU memory”); and
at least one processor coupled to the at least one memory (Dhariwal teaches a computer-implemented method in which the method is performed by GPU (page 17 section “A.1 throughput” and Table 7). Therefore the coupling between the at least one processor and a memory is inherently taught.), the at least one processor configured to:
generate a gradient associated with a current sample (See below captured picture from page 6, section 4 “Classifier Guidance”, second paragraph, in which x_t is a current sample);
[captured image: media_image2.png]
combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate (See below captured picture from page 7, top table, in which, in the expression
[captured image: media_image3.png]
the gradient is combined with an iterative model estimated score function
[captured image: media_image4.png]
. Note that Dhariwal’s diffusion model is an iterative model (Abstract; page 7, top table, “Algorithm 1” (“for all t from T to 1”); page 9, Table 4 caption: “Effect of classifier guidance on sample quality. Both conditional and unconditional models were trained for 2M iterations on ImageNet 256×256 with batch size 256.”)); and
[captured image: media_image5.png]
predict, using the diffusion machine learning model and based on the score function estimate, a new sample (See above captured picture showing “Algorithm 1”. The generated sample x_{t-1} is a new sample).
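For context only, the classifier-guided sampling loop of the kind described in Dhariwal’s Algorithm 1 can be sketched as follows. The model, classifier gradient, dimensions, and guidance scale below are toy stand-ins, not Dhariwal’s actual networks or settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def model_mean_and_var(x_t, t):
    # Toy stand-in for the diffusion model's predicted mean and variance
    # (mu_theta and Sigma_theta in Algorithm 1); in practice a trained network.
    return 0.9 * x_t, 0.01 * np.ones_like(x_t)

def classifier_log_prob_grad(x_t, y, t):
    # Toy stand-in for the classifier gradient grad_x log p(y | x_t);
    # in practice it is obtained by backpropagation through a trained classifier.
    return -(x_t - float(y))

def guided_step(x_t, y, t, scale=5.0):
    # One reverse sampling step: the gradient for the current sample x_t is
    # combined with the model output, and a new sample x_{t-1} is drawn from
    # N(mu + scale * Sigma * grad, Sigma).
    mu, var = model_mean_and_var(x_t, t)
    grad = classifier_log_prob_grad(x_t, y, t)
    return mu + scale * var * grad + np.sqrt(var) * rng.standard_normal(x_t.shape)

def sample(y, T=50, dim=4):
    x = rng.standard_normal(dim)      # x_T ~ N(0, I)
    for t in range(T, 0, -1):         # "for all t from T to 1"
        x = guided_step(x, y, t)      # each step predicts a new sample x_{t-1}
    return x

x0 = sample(y=1)
print(x0.shape)  # (4,)
```

The loop structure mirrors the iterative reverse process relied upon in the mapping above: each iteration combines a gradient for the current sample with the model output before drawing the next sample.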
As per claim 4, dependent upon claim 1, Dhariwal teaches the gradient is a classifier gradient generated using a trained classifier (See below captured picture from page 6 section 4 “Classifier Guidance” second paragraph),
[captured image: media_image6.png]
and wherein the at least one processor coupled to the at least one memory is further configured to:
generate, using the trained classifier, a prediction of a class label associated with the current sample, wherein the classifier gradient is based on the prediction of the class label (See above captured picture, in which the gradient
[captured image: media_image7.png]
is based on the prediction of the class label y).
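For illustration, the way a guidance gradient depends on the prediction of the class label y can be made concrete with a hypothetical linear softmax classifier; this is an illustrative stand-in for the trained deep classifier in the reference:

```python
import numpy as np

def classifier_gradient(x, y, W):
    # grad_x log p(y | x) for a linear softmax classifier p(. | x) = softmax(W x).
    # The gradient is driven by the gap between the one-hot target for label y
    # and the classifier's predicted class-label distribution.
    logits = W @ x
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()               # predicted distribution over class labels
    one_hot = np.zeros_like(probs)
    one_hot[y] = 1.0
    return W.T @ (one_hot - probs)

def log_p(x, y, W):
    # log p(y | x) under the same hypothetical classifier.
    logits = W @ x
    return logits[y] - np.log(np.sum(np.exp(logits)))

W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
x = np.array([0.2, -0.1])
g = classifier_gradient(x, y=0, W=W)

# A small step along g should increase log p(y = 0 | x).
print(log_p(x + 0.1 * g, 0, W) > log_p(x, 0, W))  # True
```

The sign structure is the point: following the gradient moves the sample toward regions the classifier assigns higher probability for the chosen label.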
As per claim 5, dependent upon claim 4, Dhariwal teaches the trained classifier is modified to include at least one loss function or energy function to provide a gradient for being combined with a diffusion model intermediate prediction (see the paragraph below, reproduced from page 7).
[captured image: media_image8.png]
Thus, the trained classifier is modified to include an energy function “g” to provide a gradient for being combined with a diffusion model intermediate prediction (page 7, Algorithm 2; see below).
[captured image: media_image9.png]
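For context, the guidance modification in the style of Dhariwal’s Algorithm 2 shifts the noise prediction by the classifier gradient before the deterministic (DDIM) update. The values below are toy stand-ins for network outputs:

```python
import numpy as np

def guided_epsilon(eps, grad_log_p, alpha_bar_t):
    # Algorithm 2-style guidance: shift the noise prediction by the classifier
    # gradient, eps_hat = eps - sqrt(1 - alpha_bar_t) * grad_x log p(y | x_t);
    # eps_hat is then used in place of eps in the ordinary DDIM update.
    return eps - np.sqrt(1.0 - alpha_bar_t) * grad_log_p

eps = np.array([0.5, -0.2])      # toy noise prediction for the current x_t
grad = np.array([1.0, 1.0])      # toy classifier (or energy-function) gradient at x_t
eps_hat = guided_epsilon(eps, grad, alpha_bar_t=0.91)
print(eps_hat)                   # sqrt(1 - 0.91) = 0.3, so [0.2, -0.5]
```

This is the sense in which the gradient is “combined with a diffusion model intermediate prediction”: the intermediate noise estimate itself is adjusted by the gradient term.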
As per claim 8, dependent upon claim 1, Dhariwal teaches the at least one processor coupled to the at least one memory is further configured to:
combine a respective gradient with respective iterative model data for each reverse diffusion sampling step of the diffusion machine learning model to generate respective combined data for each reverse diffusion sampling step (See below captured picture of “Algorithm 1” from page 7, which shows a reverse diffusion sampling process. Algorithm 1 is performed in an iterative manner (“for all t from T to 1”), and for each step, a respective gradient (gradient based on the current sample x_t) is combined with respective iterative model data
[captured image: media_image4.png]
); and
[captured image: media_image5.png]
generate, via the diffusion machine learning model for each reverse diffusion sampling step using the respective combined data, a respective new sample from a respective current sample (See above “Algorithm 1”. In each step t, a respective new sample x_{t-1} is generated).
As per claim 9, an independent claim, Dhariwal teaches a method (Abstract; page 7 top table “Algorithm 1”) of providing test-time self-supervised guidance for a diffusion machine learning model, comprising:
generating a gradient associated with a current sample (See mapping of similar limitations applied to claim 1);
combining the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate (See mapping of similar limitations applied to claim 1); and
predicting, using the diffusion machine learning model and based on the score function estimate, a new sample (See mapping of similar limitations applied to claim 1).
Regarding claim 12, dependent upon claim 9, claim 12 recites a method with elements corresponding to the elements recited in claim 4. Therefore, the recited elements of this claim are mapped to Dhariwal in the same manner as the corresponding elements in claim 4.
Regarding claim 13, dependent upon claim 12, claim 13 recites a method with elements corresponding to the elements recited in claim 5. Therefore, the recited elements of this claim are mapped to Dhariwal in the same manner as the corresponding elements in claim 5.
Regarding claim 16, dependent upon claim 9, claim 16 recites a method with elements corresponding to the elements recited in claim 8. Therefore, the recited elements of this claim are mapped to Dhariwal in the same manner as the corresponding elements in claim 8.
As analyzed above, the claim 17 limitations “means for generating”, “means for combining”, and “means for predicting” are interpreted as invoking 112(f). The corresponding structure is a CPU, a GPU, a processor, etc. (see the “Claim Interpretation” section above). Accordingly, the associated algorithms disclosed in the specification are read into the corresponding means. The following table lists Algorithm 1 as disclosed in the instant specification alongside Algorithm 1 as disclosed in Dhariwal; the two algorithms are the same. Note that the algorithms associated with “means for generating”, “means for combining”, and “means for predicting” in claim 17 are included in the instant Algorithm 1. Therefore, Dhariwal’s Algorithm 1 corresponds to the algorithms associated with “means for generating”, “means for combining”, and “means for predicting” in claim 17. Similarly, Dhariwal also covers the algorithms associated with “means for generating” in claim 19 and “means for combining” and “means for generating” in claim 20.
Instant Spec FIG. 7
[captured image: media_image10.png]
Dhariwal page 7 Top Table
[captured image: media_image5.png]
As per claim 17, Dhariwal teaches an apparatus (page 17, Table 7, 3rd and 4th columns; page 17, section “A.1 throughput” (see captured image below))
[captured image: media_image1.png]
for providing test-time self-supervised guidance for a diffusion machine learning model, comprising:
means for generating a gradient associated with a current sample (Dhariwal’s GPU corresponds to the recited means. See below captured picture from page 6, section 4 “Classifier Guidance”, second paragraph, in which x_t is a current sample);
[captured image: media_image2.png]
means for combining the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate (Dhariwal’s GPU corresponds to the recited means. See below captured picture from page 7, top table, in which, in the expression
[captured image: media_image3.png]
the gradient is combined with an iterative model estimated score function
[captured image: media_image4.png]
. Note that Dhariwal’s diffusion model is an iterative model (Abstract; page 7, top table, “Algorithm 1” (“for all t from T to 1”); page 9, Table 4 caption: “Effect of classifier guidance on sample quality. Both conditional and unconditional models were trained for 2M iterations on ImageNet 256×256 with batch size 256.”)); and
[captured image: media_image5.png]
means for predicting, using the diffusion machine learning model and based on the score function estimate, a new sample (Dhariwal’s GPU corresponds to the recited means. See above captured picture showing “Algorithm 1”. The generated sample x_{t-1} is a new sample).
As per claim 19, dependent upon claim 17, Dhariwal teaches the gradient is a classifier gradient generated using a trained classifier (See below captured picture from page 6 section 4 “Classifier Guidance” second paragraph),
[captured image: media_image6.png]
and wherein the apparatus further comprises:
means for generating a prediction of a class label associated with the current sample, wherein the classifier gradient is based on the prediction of the class label (Dhariwal’s GPU corresponds to the recited means. See above captured picture, in which the gradient
[captured image: media_image7.png]
is based on the prediction of the class label y).
As per claim 20, dependent upon claim 17, Dhariwal teaches the apparatus further comprises:
means for combining a respective gradient with respective iterative model data for each reverse diffusion sampling step of the diffusion machine learning model to generate respective combined data for each reverse diffusion sampling step (Dhariwal’s GPU corresponds to the recited means. See below captured picture showing Algorithm 1 from page 7, which shows a reverse diffusion sampling process. Algorithm 1 is performed in an iterative manner (“for all t from T to 1”), and for each step, a respective gradient (gradient based on the current sample x_t) is combined with respective iterative model data
[captured image: media_image4.png]
); and
[captured image: media_image5.png]
means for generating, for each reverse diffusion sampling step using the respective combined data, a respective new sample from a respective current sample (Dhariwal’s GPU corresponds to the recited means. See above “Algorithm 1”. In each step t, a respective new sample x_{t-1} is generated).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2-3, 6, 10-11, 14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Dhariwal, in view of Saxena et al. (Saxena, S., Kar, A., Norouzi, M., and Fleet, D.J., “Monocular Depth Estimation Using Diffusion Models,” arXiv preprint arXiv:2302.14816, Feb. 28, 2023; hereafter Saxena).
As per claim 2, Dhariwal teaches the current sample is associated with an input image (Abstract; FIG. 3; section 4.3). Dhariwal, however, does not teach the diffusion machine learning model comprises a diffusion-based depth estimation network.
Saxena in an analogous field discloses a method for monocular depth estimation using denoising diffusion models (Abstract). Specifically, Saxena discloses a diffusion-based depth estimation network “DepthGen” (Abstract; Fig. 1).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teaching of Dhariwal and the teaching of Saxena to include a diffusion-based depth estimation network. Doing so would enable multimodal depth inference and imputation of missing depths, as recognized by Saxena (page 1, right col., last 3 lines).
As per claim 3, dependent upon claim 2, Dhariwal in view of Saxena teaches the new sample comprises a predicted depth of the input image (Saxena Fig. 1).
As per claim 6, dependent upon claim 4, Dhariwal in view of Saxena teaches the current sample is associated with an input image (Dhariwal Abstract; FIG. 3; section 4.3), and wherein the at least one processor coupled to the at least one memory is further configured to:
determine a photometric loss value based on a current sample depth from a diffusion model and an observed next frame (Saxena Fig. 1 “masked loss”. The loss is based on a current depth from a diffusion model (“Denoised predicted depth”) and an observed next frame (“GT depth with holes”)).
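For illustration only, a masked depth loss in the spirit of Saxena’s Fig. 1 compares the denoised predicted depth against ground-truth depth only where depth is observed; the function and variable names below are illustrative, not Saxena’s:

```python
import numpy as np

def masked_l1_loss(pred_depth, gt_depth, mask):
    # Average L1 error over observed pixels only (mask == 1); pixels where the
    # ground-truth depth has holes (mask == 0) do not contribute to the loss.
    diff = np.abs(pred_depth - gt_depth) * mask
    return diff.sum() / max(mask.sum(), 1.0)

pred = np.array([[1.0, 2.0],
                 [3.0, 4.0]])    # denoised predicted depth
gt   = np.array([[1.5, 0.0],
                 [3.0, 5.0]])    # ground-truth depth with a hole at (0, 1)
mask = np.array([[1.0, 0.0],
                 [1.0, 1.0]])    # 1 where depth was observed
print(masked_l1_loss(pred, gt, mask))  # (0.5 + 0.0 + 1.0) / 3 = 0.5
```

Masking is the key design choice: it lets the loss be computed against incomplete (holey) ground truth without penalizing predictions at unobserved pixels.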
Regarding claim 10, dependent upon claim 9, claim 10 recites a method with elements corresponding to the elements recited in apparatus claim 2. Therefore, the recited elements of this claim are mapped to Dhariwal in view of Saxena in the same manner as the corresponding elements in claim 2.
Regarding claim 11, dependent upon claim 10, claim 11 recites a method with elements corresponding to the elements recited in apparatus claim 3. Therefore, the recited elements of this claim are mapped to Dhariwal in view of Saxena in the same manner as the corresponding elements in claim 3.
Regarding claim 14, dependent upon claim 12, claim 14 recites a method with elements corresponding to the elements recited in apparatus claim 6. Therefore, the recited elements of this claim are mapped to Dhariwal in view of Saxena in the same manner as the corresponding elements in claim 6.
Regarding claim 18, dependent upon claim 17, claim 18 recites an apparatus with elements corresponding to the elements recited in claim 2. Therefore, the recited elements of this claim are mapped to Dhariwal in view of Saxena in the same manner as the corresponding elements in claim 2.
Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Dhariwal in view of Saxena, and further in view of Saharia et al. (US 2023/0103638 A1, hereafter Saharia).
As per claim 7, dependent upon claim 6, Dhariwal in view of Saxena does not teach the classifier gradient comprises a gradient of the photometric loss value.
Saharia in an analogous field discloses a diffusion model for denoising an image (Abstract; para. [0010]). During training of the diffusion model, one or more gradients of an objective function that measures an error between (i) the predicted noise data and (ii) the actual noise data in the noisy image are calculated.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teaching of Dhariwal and Saxena to incorporate the teaching of Saharia to include a gradient of the photometric loss value in the classifier gradient. Doing so would enforce convergence of the diffusion model during training, as recognized by Saharia (para. [0072]).
Regarding claim 15, dependent upon claim 14, claim 15 recites a method with elements corresponding to the elements recited in apparatus claim 7. Therefore, the recited elements of this claim are mapped to Dhariwal in view of Saxena and Saharia in the same manner as the corresponding elements in claim 7.
Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XUEMEI G CHEN whose telephone number is (571)270-3480. The examiner can normally be reached Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John M Villecco can be reached on (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/XUEMEI G CHEN/Primary Examiner, Art Unit 2661