Prosecution Insights
Last updated: April 19, 2026
Application No. 18/697,686

CASCADED MULTI-RESOLUTION MACHINE LEARNING FOR IMAGE PROCESSING WITH IMPROVED COMPUTATIONAL EFFICIENCY

Non-Final OA (§102, §103)
Filed
Apr 01, 2024
Examiner
GILLIARD, DELOMIA L
Art Unit
2661
Tech Center
2600 — Communications
Assignee
Google LLC
OA Round
1 (Non-Final)
Grant Probability: 90% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 2m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 90% (976 granted / 1089 resolved; +27.6% vs TC avg; above average)
Interview Lift: +10.2% (moderate; based on resolved cases with interview)
Typical Timeline: 2y 2m average prosecution; 12 applications currently pending
Career History: 1101 total applications across all art units
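The headline figures above are mutually consistent and can be checked with a few lines of arithmetic. Note that the "implied TC average" below is an inference from the "+27.6% vs TC avg" delta (assumed to be in percentage points), not a reported figure:

```python
# Figures from the Examiner Intelligence panel above.
granted, resolved = 976, 1089
allow_rate = granted / resolved
print(f"career allow rate: {allow_rate:.1%}")        # 89.6%, displayed as 90%

# Reading "+27.6% vs TC avg" as percentage points gives an
# implied Tech Center average allow rate (an inference):
implied_tc_avg = allow_rate - 0.276
print(f"implied TC average: {implied_tc_avg:.1%}")   # 62.0%
```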

Statute-Specific Performance

§101: 10.0% (-30.0% vs TC avg)
§103: 48.8% (+8.8% vs TC avg)
§102: 15.5% (-24.5% vs TC avg)
§112: 11.3% (-28.7% vs TC avg)
Comparisons are against a Tech Center average estimate; based on career data from 1089 resolved cases.

Office Action

§102, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Preliminary Amendment

Preliminary Amendment filed April 4, 2024 has been entered. Claims 2-11 and 15 are currently amended. Claims 1-20 are pending.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1-9, 16-18 and 20 is/are rejected under 35 U.S.C.
102(a)(1) as being anticipated by "Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting" to Yi et al., hereinafter "Yi".

Claim 1. (Original) Yi teaches:

A computing system for image modification with improved computational efficiency, the computing system comprising: [Fig. 1: Inpainting results on ultra high-resolution images]

one or more processors; [4 Experimental Results] …two NVIDIA 1080 Ti GPUs

and one or more non-transitory computer-readable media that collectively store instructions that, [4.3 Comparisons With Learning-based Methods] the proposed model can inpaint 4096×4096 images…GPU memory.

when executed by the one or more processors, cause the computing system to perform operations, [Fig. 2: The overall pipeline of the method: (top) CRA mechanism]

the operations comprising: obtaining a lower resolution version of an input image, [Fig. 2 (top)] raw input image downsampled to low resolution input image

wherein the lower resolution version of the input image has a first resolution, [3.1 The Overall Pipeline] we first down-sample the image to 512 × 512 (first resolution)

wherein the lower resolution version of the input image comprises one or more image elements to be modified with predicted image data; [3.3 Architecture of Generator] The generator takes an image and a binary mask indicating the hole regions as input and predicts a completed image…

processing the lower resolution version of the input image with a first machine-learned model to generate an augmented image having the first resolution, [Fig. 2] Coarse Network (first machine-learned model) 512x512 image with mask/hole (augmented image)

wherein the augmented image comprises first predicted image data replacing the one or more image elements; [3.3 Architecture of Generator] The prediction of the coarse network is naively blended with the input image by replacing the hole region; [Fig. 2], under the Coarse Network, "replacing the hole region using the generated image"

extracting a portion of the augmented image, wherein the portion of the augmented image comprises the first predicted image data; [Fig. 2] the Coarse Network outputted 256x256 image (BRI of the portion of the augmented image to be the entire image)

upscaling the extracted portion of the augmented image to generate an upscaled image portion having an upscaled resolution; [Fig. 2] the Coarse Network's 256x256 image is upsampled to 512x512

processing the upscaled image portion with a second machine-learned model [Fig. 2] Refine Network (second machine-learned model) to generate a refined portion, [3.3 Architecture of Generator] the refine network predicts finer results

wherein the refined portion comprises second predicted image data that modifies at least a portion of the first predicted image data; [3.3 Architecture of Generator] the refine network predicts finer result… The generator takes an image and a binary mask indicating the hole regions as input and predicts a completed image. [Fig. 2] outputted refined (modified) 512x512 image.

generating an output image based on the refined portion and a higher resolution version of the input image, [Fig. 2] the Refine Network outputs the 512x512 image (right side of the Refine Network)

wherein both the output image and the higher resolution version of the input image have a second resolution that is greater than the first resolution; [Fig. 2] Overall pipeline (top): the inpainted 512x512 image outputted by the Generator is upsampled, and it and the refined hole region are both higher resolution than the raw input/low resolution input image.

and providing the output image as an output. [Fig. 2] Overall pipeline (top) – Output image.

Claim 2.
(Currently Amended) Yi teaches wherein obtaining the lower resolution version of the input image comprises downscaling the higher resolution version of the input image to obtain the lower resolution version of the input image. [Fig. 2] Overall pipeline (top) – raw image downsampled to the low resolution input image

Claim 3. (Currently Amended) Yi teaches wherein: processing the lower resolution version of the input image with the first machine-learned model to generate the augmented image comprises processing the lower resolution version of the input image and a mask that identifies the one or more image elements with a first machine-learned inpainting model to generate the augmented image having first inpainted image data that modifies the one or more image elements; [Fig. 2] the Coarse Network of the Generator – input and mask, [3.3 Architecture of Generator] and processing the upscaled image portion with the second machine-learned model to generate the refined portion comprises processing the upscaled image portion with a second machine-learned inpainting model to generate the refined portion having second inpainted image data that modifies at least a portion of the first inpainted image data. [Fig. 2] the Refine Network of the Generator – and replacing the hole region, [3.3 Architecture of Generator]

Claim 4. (Currently Amended) Yi teaches wherein upscaling the extracted portion of the augmented image to generate the upscaled image portion having the upscaled resolution [Fig. 2] the Coarse Network's 256x256 image is upsampled to 512x512 comprises upscaling the extracted portion of the augmented image such that the upscaled resolution matches a corresponding resolution of a corresponding portion of the higher resolution version of the input image, [Fig. 2] Coarse Network 512x512 upsampled image (upscaled resolution) corresponding to the resolution of the 512x512 input image with mask, wherein the corresponding portion proportionally corresponds to the extracted portion of the augmented image. [Fig. 2] Coarse Network 512x512 upsampled image (upscaled resolution) corresponding to the resolution of the 512x512 input image with mask; the corresponding portions are proportional.

Claim 5. (Currently Amended) Yi teaches wherein generating the output image based on the refined portion and the higher resolution version of the input image comprises inserting the refined portion into the higher resolution version of the input image. [Fig. 2] Refine Network 512x512 input image is the higher resolution version. [3.3 Architecture of Generator] inputs are down-sampled to 256×256 before convolution in the coarse network, different from the refine network which operates on 512×512. The prediction of the coarse network is naively blended with the input image by replacing the hole region of the latter with that of the former as the input to the refine network.

Claim 6. (Currently Amended) Yi teaches wherein the one or more image elements to be replaced comprise one or more user-designated image elements that have been designated based on one or more user inputs. [Fig. 7] The masks for Photoshop and Inpaint are manually drawn

Claim 7. (Currently Amended) Yi teaches wherein the one or more image elements to be replaced are one or more computer-designated image elements, [Fig. 7] Photoshop content-aware fill and an open-source PatchMatch implementation, wherein the one or more computer-designated image elements are designated by processing the input image with one or more classification sub-blocks of at least one of the first machine-learned model or the second machine-learned model. [Fig. 2] Discriminator

Claim 8.
(Currently Amended) Yi teaches wherein the first and the second predicted image data correspond to one or more of inpainting, deblurring, recoloring, or smoothing of the one or more image elements. [Introduction] These tasks require automated image inpainting, …High-quality inpainting usually requires generating visually realistic and semantically coherent content to fill the hole regions. [2.1 Irregular Hole-filling & Modified Convolutions] …visual artifacts such as color inconsistency, blurriness, and boundary artifacts…

Claim 9. (Currently Amended) Yi teaches wherein: the one or more objects comprises a plurality of objects; [Fig. 2] Raw input image said processing the lower resolution version of the input image with the first machine-learned model to generate the augmented image is performed once; [Fig. 2] Coarse Network …downsampled to Input and Mask and said extracting, upscaling, and processing the upscaled image portion with the second machine-learned model are performed separately for each object of the plurality of objects. [Fig. 2] Refine Network (extracts patches, upsamples images and processes the upsampled image)

Claim 16. (Original) Yi teaches One or more non-transitory computer readable media that collectively store instructions that, [4.3 Comparisons With Learning-based Methods] the proposed model can inpaint 4096×4096 images…GPU memory. when executed by one or more processors, [4 Experimental Results] …two NVIDIA 1080 Ti GPUs cause a computing system to perform operations, the operations comprising: [Fig. 2: The overall pipeline of the method: (top) CRA mechanism] obtaining a lower resolution version of an input image, [Fig. 2 (top)] raw input image downsampled to low resolution input image wherein the lower resolution version of the input image has a first resolution; [3.1 The Overall Pipeline] we first down-sample the image to 512 × 512 (first resolution) processing the lower resolution version of the input image with a first machine-learned model to generate a first predicted image having the first resolution, [Fig. 2] Coarse Network (first machine-learned model) 512x512 image with mask/hole (augmented image) [3.3 Architecture of Generator] The prediction of the coarse network is naively blended with the input image by replacing the hole region wherein the first predicted image comprises first predicted image data; [3.3 Architecture of Generator] The prediction of the coarse network is naively blended with the input image by replacing the hole region extracting a portion of the first predicted image, wherein the portion of the first predicted image comprises the first predicted image data; [Fig. 2] the Coarse Network outputted 256x256 image (BRI of the portion of the predicted image to be the entire image) upscaling the extracted portion of the first predicted image to generate an upscaled image portion having an upscaled resolution; [Fig. 2] the Coarse Network's 256x256 image is upsampled to 512x512 and processing the upscaled image portion with a second machine-learned model [Fig. 2] Refine Network (second machine-learned model) to generate a second predicted image, [3.3 Architecture of Generator] the refine network predicts finer results… The generator takes an image and a binary mask indicating the hole regions as input and predicts a completed image. wherein the second predicted image comprises second predicted image data that modifies at least a portion of the first predicted image data. [3.3 Architecture of Generator] the refine network predicts finer result; [Fig. 2] outputted refined (modified) 512x512 image.

Claim 17. (Original) Yi teaches wherein the first predicted image and the second predicted image comprise edge recognition images that indicate recognized edges in the input image. [1 Introduction] …train a convolutional network to model image-wide edge structure or foreground object contours, thus enabling auto-completion of the edge or contours

Claim 18. (Original) Yi teaches wherein the first predicted image and the second predicted image comprise object detection images that indicate objects detected in the input image. [1 Introduction] …inpainting structured images like faces [10, 12, 17, 19, 20, 21], objects [11, 13, 14, 15]

Claim 20. (Original) Yi teaches wherein the first predicted image and the second predicted image comprise face recognition images that indicate recognized faces in the input image. [1 Introduction] …inpainting structured images like faces [10, 12, 17, 19, 20, 21]

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1.
Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over "Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting" to Yi et al., hereinafter "Yi", in view of US 2017/0132528 A1 to Aslan et al., hereinafter "Aslan".

Claim 10. (Currently Amended) Yi fails to explicitly teach passing one or more internal feature vectors from the first machine-learned model to the second machine-learned model. Aslan, in the field of training a plurality of machine learning models, teaches further comprising passing one or more internal feature vectors from the first machine-learned model to the second machine-learned model. [0080] the first machine learning model is trained to learn the first task using a set of features from the training data (e.g., an n-dimensional feature vector of quantifiable information about an attribute of the data); and passing the information comprises providing the second machine learning model access to output from the first machine learning model. Yi and Aslan are both in the same field of training machine learning models to analyze image data. Thus, before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the teachings of the machine learning models of Yi with the teachings of Aslan [0009] to provide more flexibility in model training.

Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Yi in view of US 2022/0284613 A1 to Yin et al., hereinafter "Yin".

Claim 11. (Currently Amended) Yi fails to explicitly teach the augmented image further comprises a predicted depth channel output by the first machine-learned model. Yin, in the field of training machine learning models, teaches wherein the augmented image further comprises a predicted depth channel output by the first machine-learned model. [0086] teaches Depth Prediction Machine-Learning Model 300 generating a predicted depth map. Yi and Yin are both in the same field of training machine learning models to analyze image data. Thus, before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the teachings of the machine learning models of Yi with the teachings of Yin [0001-0002] to generate a robust, diverse, and accurate monocular depth prediction model.

Claim(s) 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Yi in view of US 2021/0335004 A1 to Zohar et al., hereinafter "Zohar".

Claim 19. (Original) Yi fails to explicitly teach the first predicted image and the second predicted image comprise human keypoint estimation images that indicate human keypoints detected in the input image. Zohar, in the field of training machine learning models to extract features of skeletal joints, teaches wherein the first predicted image and the second predicted image comprise human keypoint estimation images that indicate human keypoints detected in the input image. Zohar [0094] teaches machine learning to extract and predict skeletal joint positions (human keypoints), for one or more frames. Yi and Zohar are both in the same field of training machine learning models to analyze image data.
Thus, before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the teachings of the machine learning models of Yi with the teachings of Zohar [0018] to effectively detect the pose of an object.

Allowable Subject Matter

Claims 12-15 are allowed.

In regards to Claim 12, the closest prior art, US 2021/0218961 A1 to Kanazawa et al., teaches: (Original) A computer-implemented method for training machine learning models to perform image modification, the method comprising: [Abstract] A computing system can include a processor, a machine-learned image segmentation model comprising a semantic segmentation neural network and an edge refinement neural network; FIG. 7, [0006] The semantic segmentation neural network can be trained… The edge refinement neural network can be trained… receiving, by a computing system comprising one or more processors, [Abstract] A computing system can include a processor a lower resolution version of an input image and a ground truth image, [0042] inputting a training image into the image segmentation model… Each training image can have, for example, corresponding ground-truth versions [0031] …inputting the low resolution image into the semantic segmentation neural network. a loss function that evaluates a difference between the predicted image and the ground truth image; [0044] the first loss function can be determined by, for example, determining a difference between the semantic segmentation mask (understood to be the output (predicted image)) and a ground-truth semantic segmentation mask.

Kanazawa and the other prior art of record fail to explicitly teach "wherein the lower resolution version of the input image has a first resolution and the ground truth image has a second resolution that is greater than the first resolution, and wherein the lower resolution version of the input image comprises one or more image elements not present in the ground truth image."

Likewise, claims 13-15 are allowed because they are dependents of claim 12.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DELOMIA L GILLIARD whose telephone number is (571) 272-1681. The examiner can normally be reached 8am-5pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, John Villecco, can be reached at (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DELOMIA L GILLIARD/ Primary Examiner, Art Unit 2661
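The cascaded pipeline the rejected claims recite (downsample, coarse model, extract and upscale, refine model, composite back into the high-resolution input) can be sketched as follows. This is a minimal illustration, not Yi's actual networks: `coarse_net` and `refine_net` are hypothetical stand-ins (mean-fill and a light smoothing), and, mirroring the examiner's broadest-reasonable-interpretation reading, the "extracted portion" is taken to be the entire augmented image.

```python
import numpy as np

def coarse_net(img, mask):
    # Hypothetical stand-in for the first machine-learned model (coarse
    # network): fill masked pixels with the mean of the unmasked region.
    out = img.copy()
    out[mask] = img[~mask].mean()
    return out

def refine_net(patch):
    # Hypothetical stand-in for the second machine-learned model (refine
    # network): lightly smooth the upscaled prediction.
    return 0.5 * patch + 0.5 * patch.mean()

def downscale(img, factor):
    # Naive strided downsampling.
    return img[::factor, ::factor]

def upscale(img, factor):
    # Nearest-neighbour upsampling.
    return np.kron(img, np.ones((factor, factor)))

def cascaded_inpaint(hi_res, mask_hi, factor=2):
    # 1. Obtain a lower-resolution version of the input image.
    lo, mask_lo = downscale(hi_res, factor), downscale(mask_hi, factor)
    # 2. First model predicts replacement data at the first (low) resolution.
    augmented = coarse_net(lo, mask_lo)
    # 3. Extract the predicted portion (here the whole augmented image,
    #    per the BRI reading) and upscale it to the second resolution.
    upscaled = upscale(augmented, factor)
    # 4. Second model refines the upscaled portion.
    refined = refine_net(upscaled)
    # 5. Composite the refined prediction into the high-resolution input.
    out = hi_res.copy()
    out[mask_hi] = refined[mask_hi]
    return out
```

Only the masked region is replaced in the final composite, so unmasked high-resolution pixels pass through untouched; this is the efficiency argument at issue, since the expensive models only ever run at the lower resolutions.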

Prosecution Timeline

Apr 01, 2024
Application Filed
Jan 24, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602805: DATA TRANSMISSION THROTTLING AND DATA QUALITY UPDATING FOR A SLAM DEVICE (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602932: SYSTEMS AND METHODS FOR MONITORING USERS EXITING A VEHICLE (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602796: SYSTEM, DEVICE, AND METHODS FOR DETECTING AND OBTAINING INFORMATION ON OBJECTS IN A VEHICLE (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602952: IMAGE-BASED AUTOMATED ERGONOMIC RISK ROOT CAUSE AND SOLUTION IDENTIFICATION SYSTEM AND METHOD (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602895: MACHINE LEARNING-BASED DOCUMENT SPLITTING AND LABELING IN AN ELECTRONIC DOCUMENT SYSTEM (granted Apr 14, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 90%
With Interview: 99% (+10.2%)
Median Time to Grant: 2y 2m
PTA Risk: Low
Based on 1089 resolved cases by this examiner. Grant probability derived from career allow rate.
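The "99% with interview" figure appears to apply the +10.2% interview lift as a relative (multiplicative) increase rather than as additive percentage points, since 90% plus 10.2 points would exceed 100%. A quick check of that reading (the multiplicative interpretation is an assumption, not stated by the panel):

```python
base = 0.90   # grant probability without interview
lift = 0.102  # "+10.2%" interview lift, read as a relative increase
with_interview = base * (1 + lift)
print(f"with interview: {with_interview:.0%}")   # 99%
```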
