DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The IDS dated 4/29/2024 has been considered and placed in the application file.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claim 10 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for pre-AIA applications, the applicant) regards as the invention.
Claim 10 recites "wherein each compressed image in the training data is lossily compressed." It is unclear whether "lossily compressed" refers to lossy compression or lossless compression. For examination purposes, the examiner will interpret the limitation as requiring lossy compression.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 11-14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Karjauv (US 20250272795 A1) in view of Xiang (US 20230139962 A1).
Regarding claim 1, Karjauv discloses a method, comprising: receiving, by one or more processors, a compressed image comprising compression artifacts (Karjauv, paragraph [0094], "The downsample unit 530 is configured to downsample the blurred upsampled image 528 to generate a blurred LR image 532. The blurred LR image 532 corresponds to the noise-added image 432 and is a downsampled image that has a third size 534"; a compressed image is interpreted as a smaller image with less data, and a low-resolution image is likewise a smaller image with less data),
training, by the one or more processors, an artificial intelligence (AI) model (Karjauv, paragraph [0148], “The instructions, when executed by the one or more processors, also cause the one or more processors to process the noise-added version of the input image using a trained machine learning model (e.g., the ML model 440) to generate an output image (e.g., the output image 442), wherein generation of the noise-added version of the input image is based on a noise-adding operation (e.g., the noise-adding operation 434) that was used during training of the machine learning model.”), by fine-tuning a diffusion model using training data comprising a plurality of training examples of compressed images annotated with respective compression quality factors, the diffusion model trained to perform super-resolution upscaling in accordance with an upscaling factor (Karjauv, paragraph [0089], "In a second illustrative example, the one or more processors 116 are configured to perform image enhancement according to a SR mode in which the machine learning model 440 corresponds to the SR model 450. The SR model 450 may correspond to a conventional SR model that is trained with synthetic pairs of low resolution (LR) and high resolution (HR) images, where the LR images used for training are obtained by processing the HR images with a blurring kernel (e.g., Gaussian), followed by downsampling (e.g., bicubic downsampling) the blurred HR images to generate LR images for training the SR model 450", SR means super-resolution).
While Karjauv discloses generating an upscaled image in accordance with the upscaling factor, the generating comprising providing the compressed image as input to the AI model (Karjauv, paragraph [0087], Fig. 4 below, “The one or more processors 116 are also configured to process the noise-added version 432 of the input image 422 using a trained ML model 440 to generate an output image 442.”),
[media_image1.png: Karjauv, FIG. 4, greyscale, 388 × 568]
Karjauv does not explicitly teach "generating, by the one or more processors, an output image comprising fewer compression artifacts than the compressed image".
However, Xiang discloses generating, by the one or more processors, an output image comprising fewer compression artifacts than the compressed image (Xiang, paragraph [0026], "The reconstruction engine 240 may include a neural network with weights selected to generate the residuals that refine the features corresponding to the image content to be upsampled and to mitigate the features corresponding to the compression artifacts").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to implement Xiang's AI model to generate Karjauv's upscaled image with lessened artifacts/noise.
The suggestion/motivation for doing so would have been to further improve the quality of the images generated.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Karjauv in view of Xiang further discloses outputting, by the one or more processors, the output image on a display of one or more computing devices (Karjauv, paragraph [0061], "The display device 106 is configured to display output image data 107 corresponding to the output image 142 for viewing by a user of the device 102").
Therefore, it would have been obvious to combine Karjauv in view of Xiang to obtain the invention as specified in claim 1.
Regarding claim 2, Karjauv in view of Xiang discloses the method of claim 1, wherein training the AI model by fine-tuning the diffusion model comprises: determining, by the one or more processors, a loss using an output of the diffusion model from the plurality of training examples (Karjauv, paragraph [0045], "In this example, the output data is compared to the input data sample to generate a reconstruction loss, and parameters of the autoencoder are modified in an attempt to reduce (e.g., optimize) the reconstruction loss")*,
and updating, by the one or more processors and in accordance with the loss, one or more model parameter values of the diffusion model (Karjauv, paragraph [0053], "To illustrate, the denoiser 140 may be trained using one or more training sets of noisy images that are generated by adding synthetic noise to noise-free images, performing a forward pass of the ML model to generate a model output, computing a loss function based a difference between the model output and the target noise-free image, and updating parameters of the ML model based on the loss function using a gradient estimation process (e.g., backpropagation).").*
*As additionally evidenced by the Wikipedia excerpt reproduced below, neural networks have a loss function and are trained to update their parameters in accordance with the loss function.
[media_image2.png: Wikipedia excerpt, greyscale, 418 × 856]
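For illustration only, the loss-then-update pattern cited above for claim 2 (compute a loss from the model output, then update the model parameters in accordance with that loss) can be sketched as follows. The names and the one-parameter model are hypothetical and are not drawn from Karjauv or Xiang:

```python
# Illustrative only: a minimal gradient-descent loop showing the cited
# pattern of computing a loss and updating parameters per that loss.
# All names here are hypothetical, not taken from the references.

def train_step(weight: float, x: float, target: float, lr: float = 0.1) -> float:
    """One update of a one-parameter linear model y = weight * x
    under a squared-error loss (weight * x - target)**2."""
    prediction = weight * x
    loss_gradient = 2 * (prediction - target) * x  # d/dw of (wx - t)^2
    return weight - lr * loss_gradient             # update per the loss

weight = 0.0
for _ in range(50):
    weight = train_step(weight, x=1.0, target=3.0)
# weight converges toward 3.0, minimizing the squared-error loss
```

A real diffusion model performs the same two steps at scale, with the loss computed over many parameters via backpropagation, as the cited paragraph [0053] of Karjauv describes.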
Regarding claim 3, Karjauv in view of Xiang discloses the method of claim 2, wherein generating the output image comprises: adding noise, by the one or more processors, to the compressed image along a plurality of diffusion steps corresponding to diffusion operations to add noise to the compressed image (Karjauv, paragraph [0088], "In this first example, the noise-adding operation 434 includes application of synthetic noise to the input image 422 to generate the noise-added version 432 of the input image 422 (also referred to herein as the noise-added image 432), and the synthetic noise can correspond to the synthetic noise 240 that is generated based on the first distribution 242 associated with the training of the denoiser 140, as described with reference to FIG. 2."),
and removing noise, by the one or more processors, from the noised compressed image along one or more denoising steps corresponding to denoising operations to remove noise and generate the output image (Xiang, paragraph [0046], "Block 406 may include computing residuals usable to refine the features corresponding to the image content to be upsampled and mitigate the features corresponding to the compression artifact").
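For illustration only, the two phases recited in claim 3 (adding noise over a plurality of diffusion steps, then removing it over denoising steps) can be sketched as below. In a trained diffusion model a neural network predicts the noise to remove; in this hypothetical toy, the recorded noise stands in for that prediction:

```python
import random

# Illustrative only: forward diffusion (noise-adding) steps followed by
# reverse denoising steps. The recorded noise stands in for a trained
# noise-prediction network; all names are hypothetical.

def diffuse(pixels, steps=10, scale=0.1):
    """Add Gaussian noise over several forward diffusion steps."""
    noised, history = list(pixels), []
    for _ in range(steps):
        noise = [random.gauss(0.0, scale) for _ in pixels]
        noised = [p + n for p, n in zip(noised, noise)]
        history.append(noise)
    return noised, history

def denoise(noised, history):
    """Remove the noise step by step, in reverse order."""
    out = list(noised)
    for noise in reversed(history):
        out = [p - n for p, n in zip(out, noise)]
    return out

image = [0.2, 0.5, 0.9]                 # hypothetical pixel values
noisy, hist = diffuse(image)
restored = denoise(noisy, hist)         # approximately the original image
```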
Regarding claim 4, Karjauv in view of Xiang discloses the method of claim 3, wherein the AI model is a pixel-space diffusion model (Xiang, paragraph [0026], "The reconstruction engine 240 may include a neural network with weights selected to generate the residuals that refine the features corresponding to the image content to be upsampled and to mitigate the features corresponding to the compression artifacts"*).
*Using the definition from paragraph [0001] of the specification, a pixel-space diffusion model is a model that denoises images.
Claims 11-14 correspond to claims 1-4, additionally reciting a system (Karjauv, paragraph [0027], "Systems and methods to perform image enhancement are disclosed"), comprising:
one or more processors (Karjauv, paragraph [0033], "To illustrate, FIG. 1 depicts a device 102 including one or more processors ("processor(s)" 116 of FIG. 1), which indicates that in some implementations the device 102 includes a single processor 116 and in other implementations the device 102 includes multiple processors 116"). Thus, claims 11-14 are rejected for the same reasons of obviousness as claims 1-4.
Claim 20 corresponds to claim 1, additionally reciting one or more non-transitory computer-readable storage media, storing instructions that when executed by one or more processors (Karjauv, paragraph [0146], "In some implementations, a non-transitory computer-readable medium (e.g., a computer-readable storage device, such as the memory 110) includes instructions (e.g., the instructions 112) that, when executed by one or more processors (e.g., the one or more processors 116), cause the one or more processors to perform operations corresponding to at least a portion of any of the techniques described with reference to FIGS. 1-14, any of the methods of FIGS. 15-17, or any combination thereof"). Thus, claim 20 is rejected for the same reasons of obviousness as claim 1.
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Karjauv (US 20250272795 A1) in view of Xiang (US 20230139962 A1), and further in view of Pan (US 20250014233 A1).
Regarding claim 5, Karjauv in view of Xiang discloses the method of claim 4.
Karjauv in view of Xiang does not teach “wherein removing noise from the noised compressed image along the one or more denoising steps comprises processing, by the one or more processors, the noised compressed image through a consistency model trained to generate the output image by evaluating a probabilistic flow ordinary differential equation (ODE)”.
However, Pan teaches wherein removing noise from the noised compressed image along the one or more denoising steps comprises processing, by the one or more processors, the noised compressed image through a consistency model trained to generate the output image by evaluating a probabilistic flow ordinary differential equation (ODE) (Pan, paragraph [0053], "The de-noising module 250 can execute a de-noising process by solving a deterministic probability-flow ordinary differential equation (ODE) instead of by a stochastic de-noising process (e.g., as represented by p.sub.θ(x.sub.k-1|x.sub.k) in FIG. 3)").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to implement a model that de-noises Karjauv's (in view of Xiang) image through a PF-ODE, as taught by Pan.
The suggestion/motivation for doing so would have been to achieve de-noising in fewer steps.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Karjauv in view of Xiang and further view of Pan to obtain the invention as specified in claim 5.
Claim 15 corresponds to claim 5, additionally reciting a system (Karjauv, paragraph [0027], "Systems and methods to perform image enhancement are disclosed"). Thus, claim 15 is rejected for the same reasons of obviousness as claim 5.
Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Karjauv (US 20250272795 A1) in view of Xiang (US 20230139962 A1), and further in view of Kennett (US 20200389672 A1).
Regarding claim 7, Karjauv in view of Xiang discloses the method of claim 1.
Karjauv in view of Xiang does not teach “wherein receiving the training data comprises: receiving, by the one or more processors, an image compressed in accordance with a compression quality factor”.
However, Kennett teaches wherein receiving the training data comprises: receiving, by the one or more processors, an image compressed in accordance with a compression quality factor (Kennett, paragraph [0040], "For example, to save bandwidth resources, the encoder system 108 may compress the digital video to generate the encoded video content 210 having a lower resolution or bit rate than the original video content 208 to reduce bandwidth resources expended when providing the video content over the network 114").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to compress Karjauv's (in view of Xiang) image to be smaller than the original input image, as taught by Kennett.
The suggestion/motivation for doing so would have been to provide for faster data processing of the image.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Karjauv in view of Xiang and Kennett further discloses generating, by the one or more processors, the compression quality factor as a label for the image (Karjauv, paragraph [0085], "The memory 110 is configured to store an input image 422 having a first size, and data (e.g., weights, biases, and/or other parameters) corresponding to a super-resolution model 450").
Therefore, it would have been obvious to combine Karjauv in view of Xiang and in further view of Kennett to obtain the invention as specified in claim 7.
Claim 17 corresponds to claim 7, additionally reciting a system (Karjauv, paragraph [0027], "Systems and methods to perform image enhancement are disclosed"). Thus, claim 17 is rejected for the same reasons of obviousness as claim 7.
Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Karjauv (US 20250272795 A1) in view of Xiang (US 20230139962 A1) and Kennett (US 20200389672 A1), and further in view of Smith (US 20150302251 A1).
Regarding claim 9, Karjauv in view of Xiang and Kennett discloses the method of claim 7.
Karjauv in view of Xiang and Kennett does not teach “wherein receiving the training data further comprises: generating, by the one or more processors, the plurality of training examples with respective randomly selected compression quality factors”.
However, Smith teaches wherein receiving the training data further comprises: generating, by the one or more processors, the plurality of training examples with respective randomly selected compression quality factors (Smith, paragraph [0031], "In some embodiments, classification processor 104 can generate additional gaze data by performing random perturbation on the training gaze data (e.g., by making random adjustments to the resolution of the training images and/or the detected eye corner positions).").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to randomly adjust the resolution of Karjauv's (in view of Xiang and Kennett) images, as taught by Smith.
The suggestion/motivation for doing so would have been to provide a wider range of input images and improve AI model output accuracy.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Karjauv in view of Xiang and Kennett, and further in view of Smith, to obtain the invention as specified in claim 9.
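For illustration only, generating training examples with respective randomly selected compression quality factors, as recited in claims 9 and 19, can be sketched as below. The compress() stub and all names are hypothetical; a real pipeline would apply an actual lossy codec (e.g., JPEG) at the chosen quality:

```python
import random

# Illustrative only: pairing each training image with a randomly
# selected compression quality factor and using that factor as the
# annotation label. The compress() stub is hypothetical.

def compress(image, quality):
    # Stub standing in for a real lossy encoder at the given quality.
    return {"data": image, "quality": quality}

def make_training_examples(images, qualities=range(10, 96)):
    examples = []
    for image in images:
        q = random.choice(list(qualities))        # randomly selected factor
        examples.append((compress(image, q), q))  # image annotated with q
    return examples

examples = make_training_examples(["img_a", "img_b", "img_c"])
```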
Claim 19 corresponds to claim 9, additionally reciting a system (Karjauv, paragraph [0027], "Systems and methods to perform image enhancement are disclosed"). Thus, claim 19 is rejected for the same reasons of obviousness as claim 9.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Karjauv (US 20250272795 A1) in view of Xiang (US 20230139962 A1), and further in view of Silberman (US 20220116052 A1).
Regarding claim 10, Karjauv in view of Xiang discloses the method of claim 1.
Karjauv in view of Xiang does not teach “wherein each compressed image in the training data is lossily compressed”.
However, Silberman teaches wherein each compressed image in the training data is lossily compressed (Silberman, paragraph [0092], "Thus, in some embodiments described above, the machine-learned model(s) of the autonomy system 140 can be previously trained using training sensor data that was previously compressed with the lossy compression (e.g., sensor data 155 that was collected onboard the same autonomous vehicle 105 and/or other autonomous vehicles 105 of the fleet).").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to lossily compress Karjauv's (in view of Xiang) training data, as taught by Silberman.
The suggestion/motivation for doing so would have been to reduce file sizes and speed up the processing of the images.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Karjauv in view of Xiang and in further view of Silberman to obtain the invention as specified in claim 10.
Allowable Subject Matter
Claims 6, 8, 16, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WAYNE ZHANG, whose telephone number is (571) 272-0245. The examiner can normally be reached Monday through Friday, 10:00 AM to 6:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ms. Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WAYNE ZHANG/Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672