Office Action Analysis: 18504356 — Super Resolution Image Generation

Office Action

§102 §103 §112
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments, see page 12, filed 02/17/2026, with respect to claims 4, 11, and 18 have been fully considered and are persuasive. The objections to claims 4, 11, and 18 have been withdrawn.
Applicant’s arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 3-8, 10-15, and 17-23 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventors, at the time the application was filed, had possession of the claimed invention.
Claim 1 recites the limitation: “wherein the plurality of low-resolution images are acquired from the group consisting of: at different times, and by different sensors over at least partially overlapping areas.” The specification discloses temporal consistency and examples of images at different moments in time (paragraph [0014]), which supports acquisition at different times. The specification also references “overlapping consistency,” which describes a constraint applied when overlapping regions are present. However, the specification does not describe that the plurality of low-resolution images are actually acquired of overlapping areas. Rather, overlapping consistency is described as a rule or condition applied during training or evaluation. Furthermore, the specification does not describe or suggest that the plurality of low-resolution images are acquired by different sensors, as the cited disclosure relating to sensors does not establish that multiple imaging sensors are used to acquire the plurality of low-resolution images. As claims 8 and 15 contain this identical subject matter, they are also rejected. Furthermore, claims 3-7, 10-14, 17-20, and 21-23 depend from claims 1, 8, and 15, and are rejected for the same reasons set forth for claims 1, 8, and 15.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3-5, 7, 8, 10-12, 14-15, and 17-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Arefin et al. (“Multi-Image Super-Resolution for Remote Sensing using Deep Recurrent Networks”).

Regarding Claim 1, Arefin teaches a computer-implemented method comprising:
Abstract: “In this work, we present a data-driven, multi-image super resolution approach to alleviate these problems. Our approach is based on an end-to-end deep neural network that consists of an encoder, a fusion module, and a decoder.”
selecting, by one or more computer processors, a latent vector associated with a high-resolution image from a plurality of latent vectors of a generative neural network model, wherein the generative neural network model is trained using a set of rules based on physical knowledge and wherein the set of rules includes a rule that is selected from the group consisting of: a multiple timestamp consistency, and an overlapping consistency;
Abstract: “In this work, we present a data-driven, multi-image super resolution approach to alleviate these problems. Our approach is based on an end-to-end deep neural network that consists of an encoder, a fusion module, and a decoder.”
Introduction, pg. 1 and 2: “On the other hand, low-resolution data are plentiful and sometimes even publicly available at no cost. However, these may involve different acquisition sources, locations, or times, and may thus require special care in the way they are combined for super-resolution purposes… satellite imaging systems typically orbit the earth with pre-defined speeds and paths. Pixel-level inconsistencies can however still occur even with a well-calibrated system, and sub-pixel registration is often a necessary processing step for applications using several images at once… A simple solution to this problem is to increase the amount of input information by instead using multiple low-resolution images at once. This technique is called multi-image super-resolution (MISR). The challenge for MISR approaches then becomes information fusion (or registration) due to the noisy nature of the imaging process. In general, MISR is capable of more accurate high-resolution reconstructions than SISR as it aggregates more information extracted from multiple views of the target region.”
3.1 Problem Formulation, pg. 3: “An overview of our model is shown in Figure 1. It can be split into three modules (left to right): 1) Encoder, which encodes relevant features from the low-resolution images into latent representations…”
3.2 Encoder, pg. 4: “Given the input tensors $(l_i)_{i=1}^{K}$, and q, the network is trained to produce feature representations, denoted by $(r_i)_{i=1}^{K}$.”
Fig. 3 caption: “Example of a candidate region for super-resolution: a) overlapping low-resolution input images, b) reconstructed super resolution image by MISR-GRU, c) target high-resolution image.”
Explanation: The reference discloses latent representations (feature vectors) with a plurality of latent vectors ($r_1, r_2, \dots, r_K$), where selection occurs during processing of each latent representation. The reference also discloses training and model-based reconstruction, where the rules based on physical knowledge are constraints from sensor acquisition, temporal variation, and spatial consistency. The multiple timestamp consistency corresponds to the images acquired at different times, and the overlapping consistency corresponds to the overlapping views/regions.
generating, by one or more computer processors, a super resolution image from the selected latent vector;
Abstract: “Finally, a decoder reconstructs the super-resolved image.”
3.1 Problem Formulation, pg. 3: “Decoder, which reconstructs the target high-resolution image.”
downscaling, by one or more computer processors, the super resolution image to match a size of a plurality of low-resolution images, wherein the plurality of low-resolution images are acquired from the group consisting of: at different times, and by different sensors over at least partially overlapping areas;
Introduction, pg. 1: “The processes behind the acquisition pipelines that determine the quality of the imagery rely heavily on the quality of the sensors themselves, whether electro-optical, radar, or laser-based. We approach the super-resolution problem from the image reconstruction perspective which aims at generating a high-resolution image based on one or more low-resolution images…On the other hand, low-resolution data are plentiful and sometimes even publicly available at no cost. However, these may involve different acquisition sources, locations, or times, and may thus require special care in the way they are combined for super-resolution purposes. In general, MISR is capable of more accurate high-resolution reconstructions than SISR as it aggregates more information extracted from multiple views of the target region.”
Related Work, pg. 3: “As detailed earlier, MISR approaches aim to reconstruct hidden high-resolution details using multiple low-resolution observations of the same scene…Many modern optimization-based approaches to MISR build a generative model that, given a high-resolution image, simulates the acquisition of low-resolution images.”
Fig. 3 caption: “Example of a candidate region for super-resolution: a) overlapping low-resolution input images, b) reconstructed super resolution image by MISR-GRU, c) target high-resolution image.”
Explanation: The reverse relationship is explicitly disclosed, and the simulation corresponds to downscaling HR → LR. The reference also discloses that the plurality of LR images are acquired at different times and by different sensors over overlapping areas.
computing, by one or more computer processors, a difference between the down-scaled super resolution image and each of the plurality of low-resolution images;
Related Work, pg. 3: “An initial guess for the high-resolution image is then improved by minimization of the error between simulated and ground-truth low-resolution images.”
Explanation: Error = difference between downscaled HR and LR images.
determining, by one or more computer processors, a multi-image minimum difference of the difference between the down-scaled super resolution image and each of the plurality of low-resolution images;
Related Work, pg. 3: “An initial guess for the high-resolution image is then improved by minimization of the error between simulated and ground-truth low-resolution images.”
3.6 Loss Function, pg. 5: “The typical way to formulate the super-resolution training objective is to minimize the reconstruction error between the target high-resolution image and the model’s prediction.”
Explanation: Optimization across multiple images = multi-image minimum difference.
determining, by one or more computer processors, whether the multi-image minimum difference meets a pre-defined stopping criterion, wherein the pre-defined stopping criteria is selected from the group consisting of: a threshold on the multi-image minimum difference, a satisfaction of the multiple timestamp consistency, and a satisfaction of the overlapping consistency;
3.6 Loss Function, pg. 5: “The typical way to formulate the super-resolution training objective is to minimize the reconstruction error between the target high-resolution image and the model’s prediction. In this spirit, the Mean Squared Error (MSE) is commonly used in practice due to its interpretability and effectiveness.”
4.2 Experimental Setup: “The model was optimized end-to-end using Adam [24] starting with an initial learning rate of 0.0007 and gradual learning rate decay with a factor of 0.97 whenever the validation score plateaued for more than 2 epochs.”
Explanation: Plateau/convergence = stopping criterion. Threshold/metric = MSE. Temporal and spatial consistency shown above. 
responsive to the pre-defined stopping criteria being met, transmitting, by one or more computer processors, the super resolution image to a user; 
3.1 Problem Formulation, pg. 3: “The predicted output of our model is denoted by H…”
Explanation: Final output corresponds to transmission/output. 
and storing, by one or more computer processors, the super resolution image (Fig. 3 (shown below)).

[Figure: media_image1.png (greyscale, 363 × 729), Fig. 3 of Arefin]

Explanation: Generated outputs are stored/displayed for evaluation. 
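The difference, minimum-difference, and stopping-criterion steps mapped above can be illustrated with a short sketch. It is a minimal sketch under stated assumptions, not the applicant's or Arefin's implementation: the mean absolute per-pixel difference, the reading of "multi-image minimum difference" as the smallest of the per-image differences, and all function names are hypothetical choices made for illustration.

```python
def mean_abs_difference(image_a, image_b):
    """Mean absolute per-pixel difference between two same-sized images.

    The choice of metric is an assumption; the claim only recites "a difference".
    """
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )
    return total / (len(image_a) * len(image_a[0]))


def multi_image_minimum_difference(downscaled_sr, lr_images):
    """Compare one down-scaled SR image with each LR image.

    Returns the per-image differences and their minimum (an assumed reading
    of the claimed "multi-image minimum difference").
    """
    differences = [mean_abs_difference(downscaled_sr, lr) for lr in lr_images]
    return min(differences), differences


def stopping_criterion_met(min_difference, threshold):
    """Threshold variant of the claimed pre-defined stopping criterion."""
    return min_difference <= threshold


# Two 2x2 low-resolution observations of the same scene (toy values).
downscaled = [[1.0, 2.0], [3.0, 4.0]]
observations = [
    [[1.0, 2.0], [3.0, 5.0]],
    [[2.0, 3.0], [4.0, 5.0]],
]
minimum, per_image = multi_image_minimum_difference(downscaled, observations)
print(per_image, minimum, stopping_criterion_met(minimum, threshold=0.3))
```

The consistency-based criteria recited in the claim (multiple timestamp consistency, overlapping consistency) are not modeled here.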

Regarding Claim 3, Arefin teaches the computer-implemented method of claim 1, further comprising:
retrieving, by one or more computer processors, the plurality of low-resolution images;
Introduction, pg. 2: “A simple solution to this problem is to increase the amount of input information by instead using multiple low-resolution images at once. This technique is called multi-image super-resolution (MISR). The challenge for MISR approaches then becomes information fusion (or registration) due to the noisy nature of the imaging process. In general, MISR is capable of more accurate high-resolution reconstructions than SISR as it aggregates more information extracted from multiple views of the target region.”
Related Work, pg. 3: “As detailed earlier, MISR approaches aim to reconstruct hidden high-resolution details using multiple low-resolution observations of the same scene.”
and determining, by one or more computer processors, the size of each of the plurality of low-resolution images.
3.1 Problem Formulation, pg. 3: “We define the ith LR image of a scene as $l_i \in \mathbb{R}^{c \times h \times w}$. Here, c, h, and w are the (channel-wise) depth, height, and width of the input LR image, respectively.”

Regarding Claim 4, Arefin teaches the computer-implemented method of claim 1, wherein the size of an image is selected from the group consisting of: a number of pixels, a spatial resolution, and a physical size.
Results, pg. 6: “The low-resolution images were prepared with a shape of 128×128 pixels…”
4.2 Experimental Setup, pg. 6: “Because of the memory constraint and to improve generalization by data augmentation, we trained our model with randomly cropped 64 × 64 LR and corresponding 192 × 192 HR patches. As our model is fully convolutional, at test time we feed full LR images of spatial size 128 × 128 as input.”

Regarding Claim 5, Arefin teaches the computer-implemented method of claim 1, wherein the difference between the down-scaled super resolution image and each of the plurality of low resolution images is selected from the group consisting of: an average difference between the down-scaled super resolution image and the plurality of low resolution images meets the pre-defined stopping criteria, and a percentage of difference values between the down-scaled super resolution image and the plurality of low resolution images meets the pre-defined stopping criteria.
Related Work, pg. 3: “An initial guess for the high-resolution image is then improved by minimization of the error between simulated and ground-truth low-resolution images.”
3.6 Loss Function, pg. 5: “The typical way to formulate the super-resolution training objective is to minimize the reconstruction error between the target high-resolution image and the model’s prediction. In this spirit, the Mean Squared Error (MSE) is commonly used in practice due to its interpretability and effectiveness. In concordance with the evaluation guidelines of the challenge dataset (detailed in the next section), we opt to use a corrected metric for our loss function. We settle on a variant of the MSE called the corrected MSE (cMSE) which equalizes the brightness in both predicted and target images.”
Explanation: MSE = average difference. Loss thresholding corresponds to percentage/aggregate difference criteria. 
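For readers weighing the "MSE = average difference" mapping, the two metrics named in the quoted passage can be sketched as follows. This is an illustrative sketch only: the quoted text says just that cMSE "equalizes the brightness", so the mean-offset correction used here is an assumption, any masking of unreliable pixels is omitted, and the function names are hypothetical.

```python
def _pixels(image):
    """Flatten a 2D list-of-lists image into a single list of pixel values."""
    return [value for row in image for value in row]


def mse(prediction, target):
    """Mean squared error: the average of squared per-pixel differences."""
    pred, tgt = _pixels(prediction), _pixels(target)
    return sum((p - t) ** 2 for p, t in zip(pred, tgt)) / len(pred)


def cmse(prediction, target):
    """Brightness-corrected MSE (assumed form).

    A single brightness offset, the mean of (target - prediction), is added
    to the prediction before the squared error is averaged.
    """
    pred, tgt = _pixels(prediction), _pixels(target)
    bias = sum(t - p for p, t in zip(pred, tgt)) / len(pred)
    return sum((t - (p + bias)) ** 2 for p, t in zip(pred, tgt)) / len(pred)


# A prediction that differs from the target only by a uniform brightness shift.
predicted = [[1.0, 2.0], [3.0, 4.0]]
reference = [[3.0, 4.0], [5.0, 6.0]]
print(mse(predicted, reference), cmse(predicted, reference))
```

Under this assumed form, a uniform brightness shift is penalized by MSE but not by cMSE, which is the distinction the quoted passage draws.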

Regarding Claim 7, Arefin teaches the computer-implemented method of claim 1, wherein downscaling the super resolution image to match the size of the plurality of low-resolution images further comprises:
reducing, by one or more computer processors, a number of pixels associated with the super resolution image to match a number of pixels associated with the plurality of low-resolution images.
Related Work, pg. 3: “Many modern optimization-based approaches to MISR build a generative model that, given a high-resolution image, simulates the acquisition of low-resolution images.”
Results, pg. 6: “The low-resolution images were prepared with a shape of 128×128 pixels while the high-resolution (target) images contained 384 × 384 pixels…”
Explanation: This describes HR → LR transformation (downscaling), where HR has more pixels and LR has fewer pixels, requiring pixel reduction for matching.
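The pixel reduction at issue can be made concrete with the sizes quoted from Arefin (384 × 384 HR, 128 × 128 LR, a factor of 3 per side). The sketch below assumes a simple box filter (block averaging); neither the claim nor the quoted passages specify a downscaling operator, and the function name is hypothetical.

```python
def box_downscale(image, factor):
    """Reduce pixel count by averaging each factor x factor block.

    Box filtering is an assumed operator chosen for illustration only.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    area = factor * factor
    return [
        [
            sum(
                image[row * factor + dr][col * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / area
            for col in range(width // factor)
        ]
        for row in range(height // factor)
    ]


# A 384 x 384 image reduced by a factor of 3 matches the 128 x 128 LR size.
high_res = [[0.0] * 384 for _ in range(384)]
low_res = box_downscale(high_res, 3)
print(len(low_res), len(low_res[0]))
```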

Regarding Claim 8, Arefin teaches all of the limitations with respect to claim 1 above. Arefin further teaches the one or more computer readable storage media that perform substantially the same steps as claim 1 as the reference discloses the implementation of a neural network model in PyTorch which requires program instructions stored in a computer-readable medium for execution. 
4.2 Experimental Setup, pg. 6: “Our model was implemented in PyTorch [35] and made publicly available³.”

Regarding Claim 10, Arefin teaches the computer program product of claim 8, and additional limitations are met as in the consideration of claim 3 above. 

Regarding Claim 11, Arefin teaches the computer program product of claim 8, and additional limitations are met as in the consideration of claim 4 above. 

Regarding Claim 12, Arefin teaches the computer program product of claim 8, and additional limitations are met as in the consideration of claim 5 above. 

Regarding Claim 14, Arefin teaches the computer program product of claim 8, and additional limitations are met as in the consideration of claim 7 above. 

Regarding Claim 15, Arefin teaches all of the limitations with respect to claim 1 above. Arefin further teaches the one or more computer readable storage media (see claim 8 above), one or more computer processors, and the one or more computer readable memories that perform substantially the same steps as claim 1.
4.2 Experimental Setup, pg. 6: “The training process took roughly 13 hours on a NVIDIA Titan RTX with memory of 24GB. During inference our model can super-resolve around 14 scenes per second in the same GPU if each scene is processed individually without batching.”
Explanation: GPU = processor

Regarding Claim 17, Arefin teaches the computer system of claim 15, and additional limitations are met as in the consideration of claim 3 above. 

Regarding Claim 18, Arefin teaches the computer system of claim 15, and additional limitations are met as in the consideration of claim 4 above. 

Regarding Claim 19, Arefin teaches the computer system of claim 15, and additional limitations are met as in the consideration of claim 5 above. 

Regarding Claim 20, Arefin teaches the computer system of claim 15, and additional limitations are met as in the consideration of claim 7 above. 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Arefin et al. in view of El-Khamy (US10489887B2).

Regarding Claim 6, Arefin teaches the computer-implemented method of claim 1, but fails to teach that, responsive to transmitting the super resolution image to the user, the method provides, by one or more computer processors, an opportunity to the user to accept the super resolution image via a user interface.
	However, El-Khamy teaches user interaction with a super-resolution system, including explicit UI-based acceptance/capture, stating that “according to some example embodiments, once the user is satisfied with a current frame and wants to capture the image, the user may transmit a signal (e.g., by selecting a button or prompt in a user interface for interacting with the progressive fusion SR imaging system 200) to the progressive fusion SR imaging system 200 to generate the high resolution image 206” (paragraph [0060]).
	Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-image super resolution method of Arefin to include providing an opportunity for a user to accept the generated super-resolution image via a user interface. El-Khamy teaches a known technique of allowing a user to accept/capture a generated super-resolution image via a UI, which would improve usability by enabling user control over output selection and ensuring only satisfactory images are finalized. Incorporating such a user-interface mechanism into the system of Arefin would have been a predictable use of prior art elements according to their established functions to improve usability and user control, yielding predictable results. 

Regarding Claim 13, Arefin teaches the computer program product of claim 8, and additional limitations are met as in the consideration of claim 6 above. 

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Arefin et al. in view of Bora et al. (“Compressed Sensing using Generative Models”).

Regarding Claim 21, Arefin teaches the computer-implemented method of claim 1, but fails to teach that responsive to the pre-defined stopping criteria not being met, the method selects a different latent vector and repeats the generating, the downscaling, the computing, and the determining the multi-image minimum difference until the pre-defined stopping criteria is determined to be met.
	However, Bora explicitly teaches iterative latent vector selection and repetition until the error is minimized. Bora states that “our approach is to find a vector in representation space such that the corresponding vector in the sample space matches the observed measurements” (2 Our Algorithm, pg. 3), which directly corresponds to selecting a latent vector. Bora further teaches generating G(z) and computing a measurement difference, stating that “we thus define the objective to be $\mathrm{loss}(z) = \|AG(z) - y\|^2$” (2 Our Algorithm, pg. 3). Bora also discloses repeatedly updating (i.e., selecting a different latent vector z), repeating generation and computation, and continuing until convergence (i.e., stopping criteria satisfied), stating that “by using any optimization procedure, we can minimize loss(z) with respect to z…in particular, if the generative model G is differentiable, we can evaluate the gradients of the loss with respect to z using backpropagation and use standard gradient based optimizers” (2 Our Algorithm, pg. 3). Lastly, Bora states “if the optimization procedure terminates at $\hat{z}$, our reconstruction for $x^*$ is $G(\hat{z})$” (2 Our Algorithm, pg. 3), which directly corresponds to stopping criteria and final latent vector selection.
	Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate iterative latent vector optimization into Arefin’s multi-image super resolution method. Arefin already performs reconstruction using latent representations, and Bora provides a known method to improve reconstruction accuracy via iterative latent vector optimization. There existed a recognized problem (minimizing reconstruction error) and a finite number of predictable solutions (optimization techniques such as gradient descent), so applying Bora’s iterative latent vector optimization within Arefin’s reconstruction framework would have been obvious, with a reasonable expectation of success. Additionally, a person of ordinary skill in the art would have been motivated to do so because both references operate in the same field (image reconstruction using learned representations), and such application would predictably improve reconstruction accuracy.
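The iterative procedure attributed to Bora, minimizing loss(z) = ||AG(z) − y||² with respect to z by gradient steps until the optimizer terminates, can be sketched in a few lines. This is a toy sketch under loud assumptions: the generator G is replaced by a linear map W (Bora uses a trained neural generator), the gradient is written out by hand rather than obtained by backpropagation, and the names, step size, and tolerance are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def loss_and_grad(z, A, W, y):
    """Return loss(z) = ||A W z - y||^2 and its gradient with respect to z.

    W stands in for the generator G (a linear map, assumed for illustration).
    """
    residual = [p - q for p, q in zip(matvec(A, matvec(W, z)), y)]
    loss = sum(r * r for r in residual)
    # Gradient is 2 * W^T A^T residual, computed in two stages.
    at_r = [sum(A[i][j] * residual[i] for i in range(len(A)))
            for j in range(len(A[0]))]
    grad = [2 * sum(W[j][k] * at_r[j] for j in range(len(W)))
            for k in range(len(z))]
    return loss, grad


def optimize_latent(A, W, y, z0, step_size=0.1, tolerance=1e-10, max_steps=1000):
    """Update z by gradient descent until the loss meets the tolerance."""
    z = list(z0)
    loss, grad = loss_and_grad(z, A, W, y)
    steps = 0
    while loss > tolerance and steps < max_steps:
        z = [zi - step_size * gi for zi, gi in zip(z, grad)]
        loss, grad = loss_and_grad(z, A, W, y)
        steps += 1
    return z, loss, steps


# A averages two "pixels" into one measurement; W maps a 1-D latent to 2 pixels.
A = [[0.5, 0.5]]
W = [[1.0], [2.0]]
z_hat, final_loss, steps = optimize_latent(A, W, y=[3.0], z0=[0.0])
print(z_hat, final_loss, steps)
```

In the toy setup the measurement operator A plays the role of the downscaling step, and the loop's tolerance check plays the role of the stopping criterion; whether that correspondence holds for the claimed method is the question the rejection turns on.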

Regarding Claim 22, Arefin teaches the computer program product of claim 8, and additional limitations are met as in the consideration of claim 21 above. 

Regarding Claim 23, Arefin teaches the computer system of claim 15, and additional limitations are met as in the consideration of claim 21 above. 
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Park (US 20250124544 A1) teaches systems and methods for upsampling low-resolution content within a high-resolution image, which include obtaining a composite image and a mask.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM ADU-JAMFI whose telephone number is (571)272-9298. The examiner can normally be reached M-T 8:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee can be reached at (571) 270-5183. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WILLIAM ADU-JAMFI/Examiner, Art Unit 2677

/ANDREW W BEE/Supervisory Patent Examiner, Art Unit 2677