Last updated: May 29, 2026
Application No. 18/432,169
METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR GENERATING SUPER RESOLUTION IMAGE

Non-Final OA §101 §102 §103 §112
Filed: Feb 05, 2024
Priority: Jan 12, 2024 (CN 202410051905.5)
Examiner: DICKERSON, CHAD S
Art Unit: 2683
Tech Center: 2600 (Communications)
Assignee: DELL PRODUCTS, L.P.
OA Round: 1 (Non-Final)
This examiner grants 63% of cases after interview (+23.0% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 600 resolved cases, 2023–2026
Examiner Intelligence

DICKERSON, CHAD S
Career Allowance Rate: grants 63% of resolved cases (376 granted / 600 resolved), +0.7% vs TC avg
Interview Lift: +23.0% (resolved cases with interview)
Typical timeline: 3y 2m avg prosecution, 24 currently pending
Career history: 638 total applications across all art units
Statute-Specific Performance

§101: 0.4% (-39.6% vs TC avg)
§103: 93.9% (+53.9% vs TC avg)
§102: 3.4% (-36.6% vs TC avg)
§112: 1.7% (-38.3% vs TC avg)
Based on career data from 600 resolved cases
Office Action

§101 §102 §103 §112
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: 
METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR GENERATING SUPER RESOLUTION IMAGE COMPRISING CASCADING NETWORKS THAT GENERATE SUPER RESOLUTION IMAGES AND RESIDUAL IMAGES THAT ARE INPUT INTO OTHER NETWORKS UNTIL A DESIRED RESOLUTION IS REACHED.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 11 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The term “first network” can be interpreted as a brain, a communication network or assumed to be a neural network. Since the boundaries of this claim term are not clearly delineated and the scope is unclear, this claim term is considered indefinite. Claims 2-10 and 12-19 are rejected based on their dependency.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim(s) recite(s) an abstract idea of a mental step that involves the use of a separate tool. For example, the user can create an image of a higher resolution based on viewing an initial image. The user can also create an outline of specific important aspects of the image. These two images can be passed to another person to use in creating an image of greater resolution. This judicial exception is not integrated into a practical application because the claim merely links the abstract idea to a technological environment with the use of the term “network”. The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the back projection of the invention is considered well-understood, routine, conventional activity. Claims 11 and 20 have additional elements of a processor and memory to execute instructions for the invention as well as a computer program product of a non-transitory computer readable medium storing instructions for the function of the machine. These limitations also link the invention to a technological environment and do not amount to significantly more than the judicial exception.

Claim 2 is similar to claim 3 with the addition of another network. Another user can determine an outline of aspects of the higher resolution image along with producing the second higher resolution image and pass this along to another creator to use this information to create a third higher resolution image. Like the first claim, the “network” term links this claim to a technological environment, which is considered as not integrated into a practical application. The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the back projection of the invention is considered well-understood, routine, conventional activity. This also applies to claim 12.
Claim 3 is a mental step of deciding when to continue or stop creating a higher resolution image from a previous iteration. This limitation, whether taken alone or in combination, does not integrate the abstract idea into a practical application. Neither does this claim include additional elements that are considered to amount to significantly more than the judicial exception, because deciding whether the data should be further manipulated is a form of insignificant extra-solution activity. This also applies to claim 13.

In Claim 4, the first two limitations are considered as mental steps of deciding an approach to use an initial image to create a map of features of the image and generating image patches based on the map of features and a grid. Another or the same artist can use the image patches to determine how to make a portion or all of the image a higher resolution than the original. The LIIF module appears to be used to perform an aspect of the abstract idea, but it does not integrate it into a practical application. This also links this limitation to a technological environment, which is not considered to amount to significantly more than the judicial exception. This also applies to claim 14.

For Claim 5, identifying content in an image to determine a category for it is considered as a mental step. A user can practice techniques to properly draw, in more detail, a specific image associated with multiple types, and select one of those techniques to process, or create, the super resolution image. The training of models is considered as linking the invention to a specific technological environment, which does not integrate the abstract ideas into a practical application nor amount to significantly more than the judicial exception. This also applies to claim 15.
Claim 6 contains steps that can be performed by a user mentally or with the use of a tool. A user can perform identifying content based on a manual scoring system associated with image characteristics to determine a category for the image. The user can then pass it along to another, or use their own professional judgement, to judge the texture of the image or how rich the color appears on the page. The phrases “multi-classifier” and “trained models” are considered as phrases that link the claims to a technological environment and do not integrate the abstract idea into a practical application nor amount to significantly more than the judicial exception. This also applies to claim 16.

Claim 7 discloses a continuous function, which is a mathematical equation involving calculations.  This mathematical function is used to generate overlapping image patches based on the map of features and grid.  This is considered as an abstract idea.  The use of the MLP layer of the LIIF module is considered as linking the invention to a technological environment, which does not integrate the abstract idea into a practical application nor amount to significantly more than the judicial exception.  This also applies to claim 17.
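For readers unfamiliar with the technique, the kind of continuous function discussed for Claim 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation or the reference code of Chen: the names `make_mlp`, `mlp`, `query` and `render` are hypothetical, the MLP weights are random rather than trained, and the feature unfolding, local ensemble and cell decoding described in the LIIF paper are omitted. The sketch only shows that an MLP fed with a feature vector and a relative coordinate can be sampled on a grid of any size.

```python
import random

def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Build random (untrained) weights for a two-layer MLP. Hypothetical helper."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2

def mlp(params, x):
    """Linear layer, ReLU, linear layer."""
    w1, w2 = params
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def query(feature_map, params, y, x):
    """Predict an RGB value at a continuous coordinate (y, x) in [0, 1)."""
    h, w = len(feature_map), len(feature_map[0])
    # Index of the feature cell whose centre is nearest to the query point.
    iy, ix = min(int(y * h), h - 1), min(int(x * w), w - 1)
    cy, cx = (iy + 0.5) / h, (ix + 0.5) / w
    # MLP input: nearest feature vector plus the offset from its centre.
    return mlp(params, feature_map[iy][ix] + [y - cy, x - cx])

def render(feature_map, params, out_h, out_w):
    """Sample the continuous function on an out_h x out_w coordinate grid."""
    return [[query(feature_map, params, (r + 0.5) / out_h, (c + 0.5) / out_w)
             for c in range(out_w)] for r in range(out_h)]

# A 2x2 feature map with 4 channels per cell (values chosen arbitrarily).
features = [[[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]],
            [[0.9, 1.0, 1.1, 1.2], [1.3, 1.4, 1.5, 1.6]]]
params = make_mlp(in_dim=4 + 2, hidden_dim=8, out_dim=3)
patch_4x4 = render(features, params, 4, 4)
patch_8x8 = render(features, params, 8, 8)
```

The same feature map yields a 4x4 or an 8x8 output because the coordinate grid, not the feature map, sets the output size.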

Claim 8 discloses a mental step of determining a residual image between a down-sampled image and an original image.  This alone, or in combination, does not integrate the abstract idea into a practical application.  The down-sampling step using back projection is considered as a well-understood, routine and conventional activity that does not amount to more than the judicial exception.  This also applies to claim 18.

Claim 9 discloses the up-sampling of the first residual image by several modules of a network. This is considered as selecting data to manipulate, which is an additional feature that does not integrate the abstract idea into a practical application. The adding of the up-sampled first residual image to a second image is considered a well-understood, routine and conventional activity that does not amount to significantly more than the judicial exception. This also applies to claim 19.
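As an illustration of the operations characterized for Claims 8 and 9, the following sketch computes a residual between an original image and a down-sampled higher-resolution estimate, then up-samples that residual and adds it back. It assumes box-average down-sampling and nearest-neighbour up-sampling as stand-ins for the claimed network modules; all function names are hypothetical and do not come from the application or from Gao.

```python
# Illustrative only: simple resampling stands in for the learned modules.

def downsample(img, factor):
    """Average each factor x factor block (a simple box filter)."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(img[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]

def upsample(img, factor):
    """Repeat each pixel factor times along both axes (nearest neighbour)."""
    return [
        [img[y // factor][x // factor] for x in range(len(img[0]) * factor)]
        for y in range(len(img) * factor)
    ]

def residual(original, super_res, factor):
    """Back-projection residual: original minus the down-sampled estimate."""
    down = downsample(super_res, factor)
    return [[o - d for o, d in zip(ro, rd)] for ro, rd in zip(original, down)]

def add_residual(super_res, res, factor):
    """Up-sample the residual and add it to the higher-resolution image."""
    up = upsample(res, factor)
    return [[s + u for s, u in zip(rs, ru)] for rs, ru in zip(super_res, up)]

first = [[1.0, 2.0], [3.0, 4.0]]
# A deliberately biased 4x4 estimate standing in for the "second image".
second = [[v + 0.5 for v in row] for row in upsample(first, 2)]
res = residual(first, second, 2)
corrected = add_residual(second, res, 2)
```

In this toy case the residual captures the constant bias, so the corrected image down-samples back to the original.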

Claim 10 states the type of the first image, which is considered as selecting a type of data to be manipulated and thus insignificant extra-solution activity. Specifying where the network is provided links the network to a particular technological environment. None of these limitations integrates the abstract idea into a practical application, nor is any considered to amount to significantly more than the judicial exception.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1 and 20 is/are rejected under 35 U.S.C. 102(a)(1) and/or (a)(2) as being anticipated by Gao (Pub Date: 10/17/2018, titled: Image Super-resolution based on two-level residual learning CNN).

Re claim 1: Gao discloses a method for generating a super resolution image, comprising:
generating, by a first network and based on a first image of first resolution, a second image of first super resolution (e.g. on page 3 under section 2 titled “Two-level residual learning CNN”, the paper discusses a RLCNN that generates a first high-resolution image based on the input of a low-resolution image.  This is illustrated in figure 1.  The first box in figure 1 is represented by a residual learning network, or considered as the first network.);
determining, by the first network, a first residual image based on the first image and the second image (e.g. the first residual network determines a first residual image based on the high-resolution image and the low-resolution image, which is seen in figure 1 and taught on page 3.  The low-resolution image is used to determine a high-resolution image, which is then used to determine a residual image.); and
generating, by a second network, a third image of second super resolution based on the first residual image and the second image, wherein the first super resolution is higher than the first resolution and the second super resolution is higher than the first super resolution (e.g. a similar network receives both a residual image and a high-resolution image received from a previous network.  The second network outputs a collection of residuals and weights to be included with a resolution in order to be output as a higher resolution final output.  This is taught on page 5 in section 2.2 in Second-level residual learning CNN.).

Re claim 20: Gao discloses a computer program product, the computer program product being tangibly stored on a non-transitory computer-readable medium and comprising machine-executable instructions, wherein the machine-executable instructions, when executed by a machine, cause the machine to perform actions comprising:
	generating, by a first network and based on a first image of first resolution, a second image of first super resolution (e.g. on page 3 under section 2 titled “Two-level residual learning CNN”, the paper discusses a RLCNN that generates a first high-resolution image based on the input of a low-resolution image.  This is illustrated in figure 1.  The first box in figure 1 is represented by a residual learning network, or considered as the first network.);
determining, by the first network, a first residual image based on the first image and the second image (e.g. the first residual network determines a first residual image based on the high-resolution image and the low-resolution image, which is seen in figure 1 and taught on page 3.  The low-resolution image is used to determine a high-resolution image, which is then used to determine a residual image.); and
generating, by a second network, a third image of second super resolution based on the first residual image and the second image, wherein the first super resolution is higher than the first resolution and the second super resolution is higher than the first super resolution (e.g. a similar network receives both a residual image and a high-resolution image received from a previous network.  The second network outputs a collection of residuals and weights to be included with a resolution in order to be output as a higher resolution final output.  This is taught on page 5 in section 2.2 in Second-level residual learning CNN.).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 2, 3 and 11-13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gao in view of Cui (Pub Date: 2014, title: Deep Network Cascade for Image Super-Resolution).

Re claim 2: Gao discloses the method according to claim 1, further comprising:
	determining, by the second network, a second residual image based on the third image and the first image (e.g. the second residual network outputs a super resolution image based on the input of the initial first resolution image used to develop the input of the high resolution image input into the second network and the second residual image output from the second network.  This is seen in figure 1 and on page 3.).
However, Gao fails to specifically teach the features of generating, by a third network, a fourth image of third super resolution based on the second residual image and the third image, wherein the third super resolution is higher than the second super resolution.
However, this is well known in the art as evidenced by Cui.  Similar to the primary reference, Cui discloses a deep network cascade for image super-resolution (same field of endeavor or reasonably pertinent to the problem).     
Cui discloses generating, by a third network, a fourth image of third super resolution based on the second residual image and the third image, wherein the third super resolution is higher than the second super resolution (e.g. on the fourth page (i.e. page 52), the page discusses a deep network of cascade layers that can successively upscale an image for a final super resolution image by using different network units.  With a cascade of several network units utilizing the network of the primary reference, this can gradually upscale a low-resolution image with the increase of network layers, which is taught on pages 4 and 5 (i.e. pages 52 and 53).). 
Therefore, in view of Cui, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of generating, by a third network, a fourth image of third super resolution based on the second residual image and the third image, wherein the third super resolution is higher than the second super resolution, incorporated in the device of Gao, in order to gradually upscale a low-resolution image with the increase of network layers, which can aid in achieving an improvement in visual quality and quantitative performance (as stated in Cui Abstract).

Re claim 3: However, Gao fails to specifically teach the features of the method according to claim 2, further comprising: 
determining whether the third super resolution of the fourth image reaches a desired resolution, and stopping iteration if the third super resolution of the fourth image reaches the desired resolution.
However, this is well known in the art as evidenced by Cui.  Similar to the primary reference, Cui discloses a deep network cascade for image super-resolution (same field of endeavor or reasonably pertinent to the problem).     
Cui discloses further comprising: 
determining whether the third super resolution of the fourth image reaches a desired resolution, and stopping iteration if the third super resolution of the fourth image reaches the desired resolution (e.g. the paper discloses on page 8 (i.e. page 56) that the algorithm iterates the cascading of networks, which is called layers or network units, until a satisfied solution is reached.  The satisfied solution is considered as the image scale is reached.  The layers represent a network unit that can be a third or fourth network unit that upscales an image for a super resolution image.  The system can stop at the desired upscale that has a higher resolution and stop the iterations of the network units.  This is seen in the algorithm on page 7 (i.e. 56).).
Therefore, in view of Cui, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of further comprising: determining whether the third super resolution of the fourth image reaches a desired resolution, and stopping iteration if the third super resolution of the fourth image reaches the desired resolution, incorporated in the device of Gao, in order to gradually upscale a low-resolution image with the increase of network layers, which can aid in achieving an improvement in visual quality and quantitative performance (as stated in Cui Abstract).   
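The stopping condition discussed for claim 3 can be pictured with a short sketch. It assumes that each network unit doubles the image size and that "desired resolution" means a target width in pixels; `upscale_2x` and `cascade` are hypothetical names, and nearest-neighbour upscaling stands in for a trained network unit. It is not the algorithm of Cui or of the application.

```python
def upscale_2x(img):
    """Hypothetical stand-in for one network unit: nearest-neighbour 2x upscaling."""
    return [[img[y // 2][x // 2] for x in range(len(img[0]) * 2)]
            for y in range(len(img) * 2)]

def cascade(img, desired_width, max_stages=8):
    """Apply network units in sequence until the desired width is reached."""
    stages = 0
    # Stop iterating as soon as the current output meets the target,
    # or when the (assumed) safety cap on the number of stages is hit.
    while len(img[0]) < desired_width and stages < max_stages:
        img = upscale_2x(img)
        stages += 1
    return img, stages

low_res = [[0.0, 1.0], [1.0, 0.0]]
high_res, stages_used = cascade(low_res, desired_width=16)
```

Starting from a 2x2 input, three stages are needed to reach a width of 16, after which the loop exits without invoking a further unit.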

Re claim 11: Gao discloses an electronic device, comprising:
cause the electronic device to perform actions comprising:
generating, by a first network and based on a first image of first resolution, a second image of first super resolution (e.g. on page 3 under section 2 titled “Two-level residual learning CNN”, the paper discusses a RLCNN that generates a first high-resolution image based on the input of a low-resolution image.  This is illustrated in figure 1.  The first box in figure 1 is represented by a residual learning network, or considered as the first network.);
determining, by the first network, a first residual image based on the first image and the second image (e.g. the first residual network determines a first residual image based on the high-resolution image and the low-resolution image, which is seen in figure 1 and taught on page 3.  The low-resolution image is used to determine a high-resolution image, which is then used to determine a residual image.); and
generating, by a second network, a third image of second super resolution based on the first residual image and the second image, wherein the first super resolution is higher than the first resolution and the second super resolution is higher than the first super resolution (e.g. a similar network receives both a residual image and a high-resolution image received from a previous network.  The second network outputs a collection of residuals and weights to be included with a resolution in order to be output as a higher resolution final output.  This is taught on page 5 in section 2.2 in Second-level residual learning CNN.).
However, Gao fails to specifically teach the features of at least one processor; and a memory coupled to the at least one processor and having instructions stored therein, wherein the instructions, when executed by the at least one processor.
However, this is well known in the art as evidenced by Cui.  Similar to the primary reference, Cui discloses a deep network cascade for image super-resolution (same field of endeavor or reasonably pertinent to the problem).     
Cui discloses at least one processor; and a memory coupled to the at least one processor and having instructions stored therein, wherein the instructions, when executed by the at least one processor (e.g. the system discloses a memory that can be optimized in the system, in section 4.2 titled Optimization and on page 7 (i.e. 55).  Page 14 (i.e. 63) discloses a general PC that stores MATLAB code or an algorithm parallelized with the GPU in the Computational Efficiency section in part 5.2.).
Therefore, in view of Cui, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of at least one processor; and a memory coupled to the at least one processor and having instructions stored therein, wherein the instructions, when executed by the at least one processor, incorporated in the device of Gao, in order to gradually upscale a low-resolution image with the increase of network layers, which can aid in achieving an improvement in visual quality and quantitative performance (as stated in Cui Abstract).


Re claim 12: Gao discloses the electronic device according to claim 11, wherein the actions further comprise:
determining, by the second network, a second residual image based on the third image and the first image (e.g. the second residual network outputs a super resolution image based on the input of the initial first resolution image used to develop the input of the high-resolution image input into the second network and the second residual image output from the second network.  This is seen in figure 1 and on page 3.).
However, Gao fails to specifically teach the features of generating, by a third network, a fourth image of third super resolution based on the second residual image and the third image, wherein the third super resolution is higher than the second super resolution.
However, this is well known in the art as evidenced by Cui.  Similar to the primary reference, Cui discloses a deep network cascade for image super-resolution (same field of endeavor or reasonably pertinent to the problem).     
Cui discloses generating, by a third network, a fourth image of third super resolution based on the second residual image and the third image, wherein the third super resolution is higher than the second super resolution (e.g. on the fourth page (i.e. page 52), the page discusses a deep network of cascade layers that can successively upscale an image for a final super resolution image by using different network units.  With a cascade of several network units utilizing the network of the primary reference, this can gradually upscale a low-resolution image with the increase of network layers, which is taught on pages 4 and 5 (i.e. pages 52 and 53).). 
Therefore, in view of Cui, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of generating, by a third network, a fourth image of third super resolution based on the second residual image and the third image, wherein the third super resolution is higher than the second super resolution, incorporated in the device of Gao, in order to gradually upscale a low-resolution image with the increase of network layers, which can aid in achieving an improvement in visual quality and quantitative performance (as stated in Cui Abstract).

Re claim 13: However, Gao fails to specifically teach the features of the electronic device according to claim 12, wherein the actions further comprise: determining whether the third super resolution of the fourth image reaches a desired resolution, and stopping iteration if the third super resolution of the fourth image reaches the desired resolution.
However, this is well known in the art as evidenced by Cui.  Similar to the primary reference, Cui discloses a deep network cascade for image super-resolution (same field of endeavor or reasonably pertinent to the problem).     
Cui discloses wherein the actions further comprise: determining whether the third super resolution of the fourth image reaches a desired resolution, and stopping iteration if the third super resolution of the fourth image reaches the desired resolution (e.g. the paper discloses on page 8 (i.e. page 56) that the algorithm iterates the cascading of networks, which is called layers or network units, until a satisfied solution is reached.  The satisfied solution is considered as the image scale is reached.  The layers represent a network unit that can be a third or fourth network unit that upscales an image for a super resolution image.  The system can stop at the desired upscale that has a higher resolution and stop the iterations of the network units.  This is seen in the algorithm on page 7 (i.e. 56).).
Therefore, in view of Cui, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of wherein the actions further comprise: determining whether the third super resolution of the fourth image reaches a desired resolution, and stopping iteration if the third super resolution of the fourth image reaches the desired resolution, incorporated in the device of Gao, in order to gradually upscale a low-resolution image with the increase of network layers, which can aid in achieving an improvement in visual quality and quantitative performance (as stated in Cui Abstract).   


Claim(s) 4-7 and 14-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gao in view of Lee (US Pub 2025/0131537, filing date: 10/18/2023) and Chen (Pub Date: 4/1/2021, titled: Learning Continuous Image Representation with Local Implicit Image Function).

Re claim 4: However, Gao fails to specifically teach the features of the method according to claim 1, wherein generating the second image based on the first image comprises:
selecting, in a model selection module of the first network, a particular model based on a category of the first image to process the first image to obtain a feature map;
generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid; and
generating, in an image ensemble module of the first network, the second image based on the image patches.
However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein generating the second image based on the first image comprises:
selecting, in a model selection module of the first network, a particular model based on a category of the first image to process the first image to obtain a feature map (e.g. the system discloses determining the type of image that is evaluated by the image quality processor, which is taught in ¶ [69]-[72] and [75]. The image quality processor gathers a model based on the category or type of image and uses this model, or neural network, to extract features from an input image. The system discloses using a first neural network, which can be one of the networks or models to process an image, to extract a feature map from an input image. This is taught in ¶ [87] and [113]-[116].).

[0069] According to an embodiment of the disclosure, the plurality of reference models may include at least one of an image quality processing model trained based on training images having different quality values, an image quality processing model trained based on training images corresponding to different types of content, or an image quality processing model trained based on training images corresponding to different genres of content.

[0070] The image processing device 100 according to an embodiment of the disclosure may compare a quality value of training images used to train each of the plurality of reference models with a quality value of the input image, and thus, may identify a reference model trained based on training images having a similar quality of a quality of the input image.

[0071] The image processing device 100 according to an embodiment of the disclosure may compare a content characteristic of training images used to train each of the plurality of reference models with a content characteristic of the input image, and thus, may obtain a reference model trained based on training images corresponding to content matching the content characteristic of the input image. That the content characteristic matches may include that a type of content or a genre of content may be identical to or similar with each other. In an example case in which content of the input image is streaming content, the image processing device 100 may obtain a reference model trained based on training images corresponding to streaming content. In an example case in which a genre of content of the input image is a video call, the image processing device 100 may obtain a reference model trained based on person-focused training images.

[0072] In an example case in which a plurality of reference models are identified, the image processing device 100 according to an embodiment of the disclosure may interpolate the plurality of reference models so as to generate the meta model. For example, the image processing device 100 may apply a weight to each of the identified plurality of reference models, and may generate the meta model by performing weighted sum on each of the reference models to which the weight is applied. Here, the weight applied to each of the reference models may be determined based on a difference between a quality value corresponding to the reference model and a quality value of the input image. According to another embodiment, the weight applied to each of the reference models may be determined based on a difference between a type of content or a genre of the content corresponding to the reference model and a type of content or a genre of the content of the input image.
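The weighted-sum interpolation described in [0072] can be sketched as below. The paragraph states only that each weight is determined based on the difference between the reference model's quality value and that of the input image; the inverse-distance rule used here, the flat parameter lists, and the names `interpolation_weights` and `build_meta_model` are assumptions made for illustration and are not taken from Lee.

```python
def interpolation_weights(model_qualities, input_quality, eps=1e-6):
    """Assumed rule: a smaller quality gap gives a larger, normalized weight."""
    raw = [1.0 / (abs(q - input_quality) + eps) for q in model_qualities]
    total = sum(raw)
    return [r / total for r in raw]

def build_meta_model(reference_params, weights):
    """Weighted sum of the reference models' parameters, element by element."""
    return [sum(w * p[i] for w, p in zip(weights, reference_params))
            for i in range(len(reference_params[0]))]

# Two hypothetical reference models, each reduced to two parameters.
reference_params = [[0.0, 2.0], [4.0, 6.0]]
model_qualities = [1.0, 3.0]
weights = interpolation_weights(model_qualities, input_quality=2.0)
meta_params = build_meta_model(reference_params, weights)
```

With an input quality midway between the two reference qualities, both models receive equal weight and the meta model's parameters are the element-wise average.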


[0075] The image processing device 100 according to an embodiment of the disclosure may identify a category of the input image, and may select an image included in the identified category, from training data stored in the DB. The image processing device 100 may use the image selected from the training data, as the training image (first data). The first data may include the input image and a training image that is included in a category of the input image. This will be described in detail with reference to FIG. 9B.

[0087] The quality analysis unit 310 may analyze or evaluate the image quality of the input image by using a neural network trained to analyze or evaluate an image quality of an input image. For example, the neural network may be a neural network trained to evaluate an image quality of an image or a video, by using an IQA technique, a VQA technique, etc. For example, a first neural network may be a neural network trained to receive an input of an input image and output a kernel sigma value indicating a blur quality of the input image and a QF indicating a compression quality of the image. A structure of the first neural network may be equally expressed as the first neural network 400 of FIG. 4. A quality of an output input image may be represented on a quality plane graph of FIG. 5.

[0113] In an embodiment of the disclosure, the first neural network 400 may include, but is not limited to, an input layer, a hidden layer, and an output layer. In an embodiment of the disclosure, the first neural network 400 may be a deep neural network (DNN) including a plurality of hidden layers as the hidden layer.

[0114] The first neural network 400 according to an embodiment of the disclosure may be trained by using a training DB, which includes each training image and quality values corresponding to each training image as a training data set. For example, the training data set may include degraded images generated by compressing, blurring, or adding noise to high-resolution images in various ways, and quality values (answer or label) of the degraded images. For example, the first neural network 400 may be trained to output the quality value of the degraded image based on the degraded image being input to the first neural network 400. That is, the first neural network 400 may be trained to output the quality value of the degraded image when the degraded image is input to the first neural network 400.

[0115] For example, an input image including R, G, and B channels (RGB 3ch) may be input to the first neural network 400 shown in FIG. 4. According to another embodiment, a first image including Y, U, and V channels (YUV 3ch) may be input.

[0116] The first neural network 400 according to an embodiment of the disclosure may receive the input image including the R, G, and B channels (RGB 3ch) or the Y, U, and V channels (YUV 3ch), and may perform a convolution calculation by applying one or more kernels or filters to the first image, thereby extracting a feature map. For example, the first neural network 400 may output 32 channels by applying 32 3×3 filters to the input image. The first neural network 400 may scan a target of the convolution calculation pixel by pixel from left to right and from top to bottom, and may multiply the target by weight values included in the kernel and may add the result, thereby generating a result value. Data subject to the convolution calculation may be scanned by moving pixel by pixel, but may be scanned by moving by two or more pixels. During the scanning process, the number of pixels that input data moves is referred to as a stride, and a size of the output feature map may be determined according to a size of the stride.
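The relationship between stride and output feature-map size described in ¶ [0116] can be illustrated with a short sketch. It assumes a single channel, no padding (the paragraph does not mention padding), and plain nested lists; the function names are hypothetical and are not taken from Lee.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of kernel positions along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv2d_single(image, kernel, stride=1):
    """Scan the image left to right and top to bottom, multiply each window
    by the kernel weights, and add the products (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = conv_output_size(len(image), kh, stride)
    out_w = conv_output_size(len(image[0]), kw, stride)
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out
```

Under these assumptions, a 5×5 input with a 3×3 kernel yields a 3×3 feature map at stride 1 and a 2×2 feature map at stride 2, which is the dependence of output size on stride that the paragraph describes.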

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein generating the second image based on the first image comprises: selecting, in a model selection module of the first network, a particular model based on a category of the first image to process the first image to obtain a feature map, incorporated in the device of Gao, as modified by the features of Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).
However, the combination above fails to specifically teach the features of generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid; and generating, in an image ensemble module of the first network, the second image based on the image patches.
However, this is well known in the art as evidenced by Chen.  Similar to the primary reference, Chen discloses producing an image with LIIF (same field of endeavor or reasonably pertinent to the problem).     
Chen discloses generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid (e.g. the paper discloses the idea of mapping coordinates of an object within an image to features of the image signal.  On pages 1 and 2 in the Introduction section, this is presented with an image.  Section 3 titled Local Implicit Image Function discloses a feature map that uses a coordinate system in the image domain to develop the continuous image, which is on page 3.  The system can further operate within a patch-based context using the same feature map and coordinate system within the continuous image domain, which is taught on pages 5, 6 and 8.); and
generating, in an image ensemble module of the first network, the second image based on the image patches (e.g. based on the system being able to operate in a patch-based manner, the local ensemble of patches is used to create a prediction that results in a particular image, which is taught on pages 3 and 4 in the local ensemble section and the patch-based operation is taught on pages 5, 6 and 8.).
Therefore, in view of Chen, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid; and generating, in an image ensemble module of the first network, the second image based on the image patches, incorporated in the device of Gao, as modified by Cui and Lee, in order to utilize LIIF to generate a higher resolution image, which improves the feature of providing learning tasks with size-varied image ground truths (as stated in Chen Abstract and page 2).
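For orientation only, the local ensemble that the rejection relies on in Chen can be sketched as an area-weighted blend of predictions decoded from the four latent codes nearest a query coordinate. The sketch below is an illustration under stated assumptions and is not taken from the Office Action or from the claims: the function name `local_ensemble`, the corner ordering, and the `decode` callable (standing in for the MLP decoder) are hypothetical.

```python
def local_ensemble(query, corners, decode):
    """Blend predictions from the four nearest latent codes.

    query   : (x, y) coordinate in the continuous image domain
    corners : four ((vx, vy), z) pairs ordered so that corners[i] and
              corners[3 - i] are diagonally opposite
    decode  : hypothetical decoder, called as decode(z, (dx, dy)) with the
              query coordinate expressed relative to the latent code

    Each prediction is weighted by the area of the rectangle between the
    query and the diagonally opposite corner, so closer codes count more."""
    qx, qy = query
    areas = [abs(qx - vx) * abs(qy - vy) for (vx, vy), _ in corners]
    total = sum(areas)
    value = 0.0
    for i, ((vx, vy), z) in enumerate(corners):
        weight = areas[3 - i] / total
        value += weight * decode(z, (qx - vx, qy - vy))
    return value
```

If `decode` simply returns the latent value and ignores the relative coordinate, the blend reduces to ordinary bilinear interpolation, which is a convenient sanity check on the weighting.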

Re claim 5: However, Gao fails to specifically teach the features of the method according to claim 4, wherein selecting the particular model to process the first image comprises:
identifying content of the first image to determine the category of the first image; and training models in the model selection module separately for corresponding categories of images; and
selecting, for the first image, the particular model based on performance of corresponding models in the trained models to process the first image.
However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein selecting the particular model to process the first image comprises:
identifying content of the first image to determine the category of the first image; and training models in the model selection module separately for corresponding categories of images (e.g. the category of the image input into the system is determined in order to select a model to process that category of image, which is taught in ¶ [69]-[72] and [75].  The system discloses training certain models specific to the type or category of image input, which is taught in ¶ [69]-[72] and [75] above.); and
selecting, for the first image, the particular model based on performance of corresponding models in the trained models to process the first image (e.g. the particular reference models are selected based on the content in the image and whether the model can process the content related to the image, which is taught in ¶ [69]-[72] and [75] above.).
Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein selecting the particular model to process the first image comprises: identifying content of the first image to determine the category of the first image; and training models in the model selection module separately for corresponding categories of images; and selecting, for the first image, the particular model based on performance of corresponding models in the trained models to process the first image, incorporated in the device of Gao, as modified by Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).

Re claim 6: However, Gao fails to specifically teach the features of the method according to claim 5, 
wherein a multi-classifier of the model selection module is used to identify the content of the first image based on a classification score to determine the category of the first image; and
wherein the performance of the corresponding models in the trained models is determined based on non-reference metrics for perceptual estimation.
However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein a multi-classifier of the model selection module is used to identify the content of the first image based on a classification score to determine the category of the first image (e.g. the calculation of a probability value is used to identify the category of the input image.  The value represents how probable it is that the image represents a particular category or an image associated with a particular category, which is taught in ¶ [160]-[162] and [197]-[200].); and

[0160] According to an embodiment of the disclosure, the training DB generator 321 may obtain an image corresponding to the viewing information, and may use the image as a training image (first data). For example, the training DB generator 321 may select an image corresponding to content having a same or similar type and genre, from training data stored in a DB 327. The training data may include high-resolution images. The training data may be stored in the external database, or may be stored in an internal memory of the image processing device 100. The first data may include a training image corresponding to the input image and content having a same or similar characteristic (type and genre) of the input image.

[0161] According to another embodiment, the training DB generator 321 according to an embodiment of the disclosure may identify a category of an input image. For example, the training DB generator 321 may identify the category of the input image by a probability value. The training DB generator 321 may identify a category of a highest probability value as the category of the input image, and may select images in the identified category, from training data stored in the DB 327. The training DB generator 321 may obtain first data including an input image and a training image included in a category of the input image.

[0162] According to another embodiment, the training DB generator 321 according to an embodiment of the disclosure may obtain a preset number of images from among images that is included in the same category of the input image. According to another embodiment, the training DB generator 321 may identify a preset number of categories in order of high probability values from among categories of an input image, and may obtain images included in the identified categories, in proportion to the probability values. In an example case in which the training DB generator 321 determines that a probability an object included in an input image is a dog is 70% and a cat is 30%, a dog image and a cat image from among training data may be obtained at a ratio of 7:3.
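The proportional selection in ¶ [0162] (a dog probability of 70% and a cat probability of 30% giving a 7:3 ratio of training images) can be sketched as below. The function name `allocate_by_probability`, the `top_k` parameter, and the largest-remainder rounding used to make the counts add up to the requested total are assumptions for illustration; Lee does not specify a rounding rule.

```python
def allocate_by_probability(probs, total, top_k=2):
    """Split `total` training images across the `top_k` most probable
    categories in proportion to their probability values.

    probs : dict mapping category name to probability
    Returns a dict mapping category name to an image count."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    mass = sum(p for _, p in top)
    exact = {c: total * p / mass for c, p in top}
    counts = {c: int(v) for c, v in exact.items()}
    # Hand out any images lost to flooring, largest fractional part first.
    leftover = total - sum(counts.values())
    by_fraction = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for c in by_fraction[:leftover]:
        counts[c] += 1
    return counts


counts = allocate_by_probability({"dog": 0.7, "cat": 0.3}, total=10)
# counts == {"dog": 7, "cat": 3}
```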


[0197] The training DB generator 321 according to an embodiment of the disclosure may identify the category of the input image by using a third neural network 930. The third neural network 930 according to an embodiment of the disclosure may be an algorithm that receives images and classifies image categories from the input images, or a set of algorithms, software for executing a set of algorithms, and/or hardware for executing a set of algorithms.

[0198] The third neural network 930 according to an embodiment of the disclosure may use a softmax regression function to obtain various classes or categories as a result. The softmax function may be used when there are a plurality of correct answers (classes) that must be classified. For example, the softmax function may be used when predicting a plurality of classes. In an example case in which a total number of classes is k, the softmax function may receive a k-dimensional vector and may estimate a probability for each class. The third neural network 930 according to an embodiment of the disclosure may be a neural network that receives a k-dimensional vector and is trained such that the probability for each class obtained therefrom is equal to a correct answer set. However, the disclosure is not limited thereto, and the third neural network 930 may be implemented as algorithms of various types capable of classifying image categories from an input image.

[0199] The third neural network 930 according to an embodiment of the disclosure may obtain a probability value for a category or class of the input image as a result. For example, the third neural network 930 may obtain vectors representing the probability that the category of the input image is a human face, a dog, a cat, and a building as 0.5, 0.2, 0.2, and 0.1, respectively, as the result value.

[0200] The training DB generator 321 according to an embodiment of the disclosure may identify a category with the highest probability as the category of the input image. For example, in the example above, the training DB generator 321 may identify the category of the input image as the human face, which is the category with the greatest vector value.
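The classification step in ¶ [0198]-[0200] (a softmax over k classes followed by selection of the most probable category) can be sketched as follows. The helper names `softmax` and `top_category` are hypothetical, and the label list simply reuses the four categories from the example in ¶ [0199].

```python
import math


def softmax(logits):
    """Convert a k-dimensional score vector into per-class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_category(probs, labels):
    """Return the label whose probability value is highest."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]


labels = ["human face", "dog", "cat", "building"]
category = top_category([0.5, 0.2, 0.2, 0.1], labels)
# category == "human face"
```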

wherein the performance of the corresponding models in the trained models is determined based on non-reference metrics for perceptual estimation (e.g. the system discloses selecting particular models and training these models based on image quality of an image, which is used to estimate the performance of the model.  This is taught in ¶ [52]-[54].).

[0052] The image processing device 100 according to an embodiment of the disclosure may train a meta model, based on an image quality of the input image 110 and viewing information related to the input image 110. According to an embodiment, the viewing information may be referred to as watching information. The image quality of the input image 110 may indicate a degree of degradation of an image which is analyzed by an image quality analyzer. The viewing information related to the input image 110 may include auxiliary information related to the input image 110 other than the degree of degradation of the image which is analyzed by the image quality analyzer. For example, the auxiliary information, may include, but is not limited to, an attribute of the input image, a characteristic of the input image, or a watching environment when the input image 110 is watched. For example, the viewing information related to the input image 110 may include, but is not limited to, compression information of the input image 110, a type of content, a genre of the content, etc. Also, the viewing information related to the input image 110 may include, but is not limited to, an ambient environment of a device, a viewing distance between a user and the device, user personal information, etc. According to an embodiment, the viewing distance may be referred to as watching distance.

[0053] For example, the image processing device 100 may generate training data of the meta model, based on the image quality of the input image 110 and the viewing information related to the input image 110. For example, the image processing device 100 may generate (or obtain) the meta model, based on the image quality of the input image 110 and the viewing information related to the input image 110.

[0054] The image processing device 100 according to an embodiment of the disclosure may train the meta model, based on the image quality of the input image 110 and the viewing information related to the input image 110. Accordingly, the image processing device 100 may obtain the meta model in which a parameter is optimized according to the image quality of the input image 110 and the watching environment of the user. The image processing device 100 may generate the high-resolution output image 120, based on the meta model having the parameter optimized according to the image quality of the input image 110 and the watching environment of the user.

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein a multi-classifier of the model selection module is used to identify the content of the first image based on a classification score to determine the category of the first image; and wherein the performance of the corresponding models in the trained models is determined based on non-reference metrics for perceptual estimation, incorporated in the device of Gao, as modified by Cui, in order to utilize a score or value to identify content or type of an image and train models using metrics to process the input image, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).

Re claim 7: However, Gao fails to specifically teach the features of the method according to claim 4, wherein generating the image patches based on the feature map and the search coordinate grid comprises:
generating, in a multilayer perceptron (MLP) layer of the LIIF module, the image patches based on the feature map and the search grid using a continuous function, wherein the image patches are overlapping.
However, this is well known in the art as evidenced by Chen.  Similar to the primary reference, Chen discloses producing an image with LIIF (same field of endeavor or reasonably pertinent to the problem).     
Chen discloses wherein generating the image patches based on the feature map and the search coordinate grid comprises:
generating, in a multilayer perceptron (MLP) layer of the LIIF module, the image patches based on the feature map and the search grid using a continuous function, wherein the image patches are overlapping (e.g. the paper describes LIIF operating in a patch environment.  The MLP layers are used in the decoding function of the LIIF.  The images, or patches, are based on a feature map and coordinates of the feature map using a continuous function of the continuous image.  This is taught in section 3 in the Local Implicit Image Function on pages 3 and 4.  The patch-based method with the MLP layers is described on pages 5, 6 and 8.).

Therefore, in view of Chen, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein generating the image patches based on the feature map and the search coordinate grid comprises: generating, in a multilayer perceptron (MLP) layer of the LIIF module, the image patches based on the feature map and the search grid using a continuous function, wherein the image patches are overlapping, incorporated in the device of Gao, as modified by Cui and Lee, in order to utilize LIIF to generate a higher resolution image, which improves the feature of providing learning tasks with size-varied image ground truths (as stated in Chen Abstract and page 2).


Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gao in view of Lee (US Pub 2025/0131537 (filing date: 10/18/2023)).

Re claim 10: However, Gao fails to specifically teach the features of the method according to claim 1, wherein the first image comprises a static image or an image frame extracted from a motion video, and wherein the first network is provided on an edge device or a cloud.
However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein the first image comprises a static image or an image frame extracted from a motion video, and wherein the first network is provided on an edge device or a cloud (e.g. the system works with a static image or a frame image of a video, which is taught in ¶ [61].  The network within the image processing device can be performed on a server, which is taught in ¶ [111].).

[0061] In an example case in which the input image is any one of a plurality of frame images included in video content, the image processing device 100 according to an embodiment of the disclosure may obtain quality information of the input image, based on quality information of each of the plurality of frame images.

[0111] According to an embodiment, the first neural network 400 may receive various data and may be trained to learn or discover a method of analyzing the input data, a method of classifying the input data, and/or a method of extracting a feature necessary for generating result data from the input data. For example, the first neural network 400 may be trained to implement a method of analyzing the input data, a method of classifying the input data, and/or a method of extracting a feature necessary for generating result data from the input data. The first neural network 400 may be generated as an artificial intelligence model with desired characteristics by applying a training algorithm to a plurality of pieces of training data. This training may be performed in the image processing device 100 itself or may be performed via a separate server/system.

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein the first image comprises a static image or an image frame extracted from a motion video, and wherein the first network is provided on an edge device or a cloud, incorporated in the device of Gao, as modified by Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).

Re claim 14: However, Gao fails to specifically teach the features of the electronic device according to claim 11, wherein generating the second image based on the first image comprises:
selecting, in a model selection module of the first network, a particular model based on a category of the first image to process the first image to obtain a feature map;
generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid; and
generating, in an image ensemble module of the first network, the second image based on the image patches.

However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein generating the second image based on the first image comprises: selecting, in a model selection module of the first network, a particular model based on a category of the first image to process the first image to obtain a feature map (e.g. the system discloses determining the type of image that is evaluated by the image quality processor, which is taught in ¶ [69]-[72] and [75] above.  The image quality processor gathers a model based on the category or type of image and uses this model, or neural network, to extract features from an input image.  The system discloses using a first neural network, which can be one of the networks or models to process an image, to extract a feature map from an input image.  This is taught in ¶ [87] and [113]-[116] above.).

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein generating the second image based on the first image comprises: selecting, in a model selection module of the first network, a particular model based on a category of the first image to process the first image to obtain a feature map, incorporated in the device of Gao, as modified by the features of Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).

However, the combination above fails to specifically teach the features of generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid; and generating, in an image ensemble module of the first network, the second image based on the image patches.
However, this is well known in the art as evidenced by Chen.  Similar to the primary reference, Chen discloses producing an image with LIIF (same field of endeavor or reasonably pertinent to the problem).     
Chen discloses generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid (e.g. the paper discloses the idea of mapping coordinates of an object within an image to features of the image signal.  On pages 1 and 2 in the Introduction section, this is presented with an image.  Section 3 titled Local Implicit Image Function discloses a feature map that uses a coordinate system in the image domain to develop the continuous image, which is on page 3.  The system can further operate within a patch-based context using the same feature map and coordinate system within the continuous image domain, which is taught on pages 5, 6 and 8.); and
generating, in an image ensemble module of the first network, the second image based on the image patches (e.g. based on the system being able to operate in a patch-based manner, the local ensemble of patches is used to create a prediction that results in a particular image, which is taught on pages 3 and 4 in the local ensemble section and the patch-based operation is taught on pages 5, 6 and 8.).
Therefore, in view of Chen, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of generating, in a local implicit image function (LIIF) module of the first network, image patches based on the feature map and a search coordinate grid; and generating, in an image ensemble module of the first network, the second image based on the image patches, incorporated in the device of Gao, as modified by Cui and Lee, in order to utilize LIIF to generate a higher resolution image, which improves the feature of providing learning tasks with size-varied image ground truths (as stated in Chen Abstract and page 2).

Re claim 15: However, Gao fails to specifically teach the features of the electronic device according to claim 14, wherein selecting the particular model to process the first image comprises:
identifying content of the first image to determine the category of the first image; and training models in the model selection module separately for corresponding categories of images; and
selecting, for the first image, the particular model based on performance of corresponding models in the trained models to process the first image.
However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein selecting the particular model to process the first image comprises: identifying content of the first image to determine the category of the first image; and training models in the model selection module separately for corresponding categories of images (e.g. the category of the image input into the system is determined in order to select a model to process that category of image, which is taught in ¶ [69]-[72] and [75].  The system discloses training certain models specific to the type or category of image input, which is taught in ¶ [69]-[72] and [75] above.); and
selecting, for the first image, the particular model based on performance of corresponding models in the trained models to process the first image (e.g. the particular reference models are selected based on the content in the image and whether the model can process the content related to the image, which is taught in ¶ [69]-[72] and [75] above.).
Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein selecting the particular model to process the first image comprises: identifying content of the first image to determine the category of the first image; and training models in the model selection module separately for corresponding categories of images; and selecting, for the first image, the particular model based on performance of corresponding models in the trained models to process the first image, incorporated in the device of Gao, as modified by Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).

Re claim 16: However, Gao fails to specifically teach the features of the electronic device according to claim 15, wherein the model selection module comprises a multi-classifier which identifies the content of the first image based on a classification score to determine the category of the first image; and
wherein the model selection module determines the performance of the corresponding models in the trained models based on non-reference metrics for perceptual estimation.
However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein the model selection module comprises a multi-classifier which identifies the content of the first image based on a classification score to determine the category of the first image (e.g. the calculation of a probability value is used to identify the category of the input image.  The value represents how probable it is that the image represents a particular category or an image associated with a particular category, which is taught in ¶ [160]-[162] and [197]-[200] above.); and
wherein the model selection module determines the performance of the corresponding models in the trained models based on non-reference metrics for perceptual estimation (e.g. the system discloses selecting particular models and training these models based on image quality of an image, which is used to estimate the performance of the model.  This is taught in ¶ [52]-[54] above.).

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of wherein the model selection module comprises a multi-classifier which identifies the content of the first image based on a classification score to determine the category of the first image; and wherein the model selection module determines the performance of the corresponding models in the trained models based on non-reference metrics for perceptual estimation, incorporated in the device of Gao, as modified by Cui, in order to utilize a score or value to identify content or type of an image and train models using metrics to process the input image, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).

Re claim 17: However, Gao fails to specifically teach the features of the electronic device according to claim 14, wherein generating the image patches based on the feature map and the search coordinate grid comprises: generating, in a multilayer perceptron (MLP) layer of the LIIF module, the image patches based on the feature map and the search grid using a continuous function, wherein the image patches are overlapping.

However, this is well known in the art as evidenced by Chen.  Similar to the primary reference, Chen discloses producing an image with LIIF (same field of endeavor or reasonably pertinent to the problem).     
Chen discloses wherein generating the image patches based on the feature map and the search coordinate grid comprises: generating, in a multilayer perceptron (MLP) layer of the LIIF module, the image patches based on the feature map and the search grid using a continuous function, wherein the image patches are overlapping (e.g. the paper describes LIIF operating in a patch environment.  The MLP layers are used in the decoding function of the LIIF.  The images, or patches, are based on a feature map and coordinates of the feature map using a continuous function of the continuous image.  This is taught in section 3 in the Local Implicit Image Function section on pages 3 and 4.  The patch-based method with the MLP layers is described on pages 5, 6 and 8.).

Therefore, in view of Chen, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of wherein generating the image patches based on the feature map and the search coordinate grid comprises: generating, in a multilayer perceptron (MLP) layer of the LIIF module, the image patches based on the feature map and the search grid using a continuous function, wherein the image patches are overlapping, incorporated in the device of Gao, as modified by Cui and Lee, in order to utilize LIIF to generate a higher resolution image, which improves the feature of providing learning tasks with size-varied image ground truths (as stated in Chen Abstract and page 2).   
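
For illustration only, the coordinate-based decoding discussed above can be sketched as follows. This is a simplified reading of the LIIF idea (decode a value from the nearest feature vector plus the query's offset from that vector's position), not the implementation of Chen or of the claimed LIIF module. The function names, the (y, x) coordinate convention on [-1, 1], and the random untrained MLP weights are assumptions made for the example. Overlapping patches, feature unfolding, and the local ensemble are omitted.

```python
import numpy as np

def cell_centers(n: int) -> np.ndarray:
    """Centers of n equal cells spanning the interval [-1, 1]."""
    return -1.0 + (2.0 * np.arange(n) + 1.0) / n

def liif_query(features: np.ndarray, query_yx: np.ndarray, weights) -> np.ndarray:
    """Decode values at continuous coordinates from the nearest feature vector.

    features: (H, W, C) feature map (hypothetical encoder output).
    query_yx: (N, 2) query coordinates as (y, x) in [-1, 1].
    weights:  list of (matrix, bias) pairs for a small ReLU MLP (assumed, untrained).
    """
    h, w, _ = features.shape
    ys, xs = cell_centers(h), cell_centers(w)
    # Index of the nearest feature cell for every query coordinate.
    iy = np.abs(query_yx[:, 0:1] - ys[None, :]).argmin(axis=1)
    ix = np.abs(query_yx[:, 1:2] - xs[None, :]).argmin(axis=1)
    z = features[iy, ix]                                   # (N, C) nearest features
    rel = query_yx - np.stack([ys[iy], xs[ix]], axis=1)    # (N, 2) relative offsets
    hidden = np.concatenate([z, rel], axis=1)
    for i, (matrix, bias) in enumerate(weights):
        hidden = hidden @ matrix + bias
        if i < len(weights) - 1:
            hidden = np.maximum(hidden, 0.0)               # ReLU on hidden layers only
    return hidden

# Query a 4x4 feature map on an 8x8 coordinate grid (2x up-sampling).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 8))
weights = [(rng.normal(size=(10, 16)), np.zeros(16)),
           (rng.normal(size=(16, 3)), np.zeros(3))]
grid = cell_centers(8)
yy, xx = np.meshgrid(grid, grid, indexing="ij")
coords = np.stack([yy.ravel(), xx.ravel()], axis=1)
pixels = liif_query(features, coords, weights).reshape(8, 8, 3)
```

Because the decoder takes a continuous coordinate as input, the same feature map can be queried on a grid of any density, which is the sense in which the function is continuous.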


Claim(s) 7 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gao, as modified by Lee, as applied to claims 8 and 18 above, and further in view of Chen.


Claim(s) 8 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gao in view of Liu (Pub Date: 6/20/2019, title: Hierarchical Back Projection Network for Image Super-Resolution).

Re claim 8: However, Gao fails to specifically teach the features of the method according to claim 1, wherein determining the first residual image based on the first image and the second image comprises:
	down-sampling the second image in a back projection module of the first network; and 
determining the first residual image between the down-sampled second image and the first image.
However, this is well known in the art as evidenced by Liu.  Similar to the primary reference, Liu discloses back projection network for image super resolution (same field of endeavor or reasonably pertinent to the problem).     
Liu discloses wherein determining the first residual image based on the first image and the second image comprises:
	down-sampling the second image in a back projection module of the first network (e.g. the system discloses down-sampling a super resolution image of a network, which is taught on page 3 and illustrated in figure 2.); and 
determining the first residual image between the down-sampled second image and the first image (e.g. the system determines the residual image based on the down-sampled super resolution image and the low-resolution image, which is taught on page 3 in the Hierarchical Back Projection Network section and seen in figure 2.).
Therefore, in view of Liu, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of wherein determining the first residual image based on the first image and the second image comprises: down-sampling the second image in a back projection module of the first network; and determining the first residual image between the down-sampled second image and the first image, incorporated in the device of Gao, as modified by Lee and Chen, in order to perform a back projection approach to super resolution, which can minimize the super resolution residual information and allow for extraction of more compact features for reconstruction (as stated in Liu page 3).   
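
For illustration only, the two recited steps (down-sampling the second image, then determining a residual against the first image) can be sketched as below. This is a minimal sketch under stated assumptions, not the method of Liu or of the application: block averaging stands in for whatever down-sampling the back projection module uses, the residual sign (low-resolution input minus down-sampled estimate) is an assumed convention, and the function names are hypothetical.

```python
import numpy as np

def downsample_mean(image: np.ndarray, scale: int) -> np.ndarray:
    """Down-sample a 2-D image by averaging non-overlapping scale x scale blocks."""
    h, w = image.shape
    if h % scale or w % scale:
        raise ValueError("image dimensions must be divisible by scale")
    return image.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def back_projection_residual(lr_image: np.ndarray, sr_image: np.ndarray, scale: int) -> np.ndarray:
    """Residual between the low-resolution input and the down-sampled SR estimate."""
    return lr_image - downsample_mean(sr_image, scale)

# A super-resolution estimate whose block means reproduce the input leaves no residual.
lr = np.array([[1.0, 2.0], [3.0, 4.0]])
sr = np.kron(lr, np.ones((2, 2)))
residual = back_projection_residual(lr, sr, 2)
```

A non-zero residual indicates where the super-resolution estimate is inconsistent with the low-resolution input.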

Re claim 18: However, Gao fails to specifically teach the features of the electronic device according to claim 11, wherein determining the first residual image based on the first image and the second image comprises: down-sampling the second image in a back projection module of the first network; and determining the first residual image between the down-sampled second image and the first image.

However, this is well known in the art as evidenced by Liu.  Similar to the primary reference, Liu discloses back projection network for image super resolution (same field of endeavor or reasonably pertinent to the problem).     
Liu discloses wherein determining the first residual image based on the first image and the second image comprises: down-sampling the second image in a back projection module of the first network (e.g. the system discloses down-sampling a super resolution image of a network, which is taught on page 3 and illustrated in figure 2.); and 
determining the first residual image between the down-sampled second image and the first image (e.g. the system determines the residual image based on the down-sampled super resolution image and the low-resolution image, which is taught on page 3 in the Hierarchical Back Projection Network section and seen in figure 2.).
Therefore, in view of Liu, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of wherein determining the first residual image based on the first image and the second image comprises: down-sampling the second image in a back projection module of the first network; and determining the first residual image between the down-sampled second image and the first image, incorporated in the device of Gao, as modified by Lee and Chen, in order to perform a back projection approach to super resolution, which can minimize the super resolution residual information and allow for extraction of more compact features for reconstruction (as stated in Liu page 3).   

Claim(s) 9 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gao, as modified by Liu, as applied to claims 8 and 18 above, and further in view of Lee and Chen.

Re claim 9: However, Gao fails to specifically teach the features of the method according to claim 8, wherein generating the third image based on the first residual image and the second image comprises:
up-sampling the first residual image in a model selection module, an LIIF module, and an image ensemble module of the second network; and
adding the up-sampled first residual image to the second image to generate the third image.

However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein generating the third image based on the first residual image and the second image comprises:
up-sampling the first residual image in a model selection module of the second network (e.g. the system discloses down-sampling a training image, which can serve as a low-resolution image output from the training database generator.  This is taught in ¶ [163] and [211].  Later in the process, the down sampled image can be upscaled to convert the low-resolution image to a high-resolution image, which is taught in ¶ [49] and [50].).

[0049] The image processing device 100 according to an embodiment of the disclosure may perform image quality processing on an image. For example, the image processing device 100 may obtain an output image 120 by performing image quality processing on an input image 110. For example, the image processing device 100 may obtain a high-resolution (or high quality) output image by upscaling a low-resolution (or low quality) input image, by using an image quality processing model.

[0050] The image quality processing model according to an embodiment of the disclosure may include a neural network model configured to implement an upscaling algorithm capable of converting a low-resolution image into a high-resolution image. For example, the image quality processing model may include a neural network model trained, from a neural network model obtained based on a quality of the input image 110, by using training data corresponding to the input image 110.


[0163] According to an embodiment of the disclosure, the training DB generator 321 may generate an image (second data) of which image quality is degraded to have an image quality corresponding to the quality value and the viewing information of the input image. The training DB generator 321 may degrade training images included in the first data, and thus, may generate an image degraded from a training image, according to a degree of degradation of the input image. For example, the training DB generator 321 may generate the image quality degraded image by performing at least one of methods including compression degradation, blurring, noise addition, and down sampling on the selected training images.

[0211] For example, the training DB generator 321 may perform filtering to degrade the images included in the first data 920. The training DB generator 321 may use two-dimensional kernel to apply blur degradation to the images. According to another embodiment, the training DB generator 321 may process box blur to perform modelling on motion degradation. According to another embodiment, the training DB generator 321 may use a Gaussian filter to apply a shape or optical blur to the images. According to another embodiment, the training DB generator 321 may perform down sampling on the images included in the first data 920, based on resolution information from among the compression information of the input image.

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of wherein generating the third image based on the first residual image and the second image comprises: up-sampling the first residual image in a model selection module of the second network, incorporated in the device of Gao, as modified by Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).   

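However, the combination above fails to specifically teach the features of an LIIF module, and an image ensemble module of the second network.  
However, this is well known in the art as evidenced by Chen.  Similar to the primary reference, Chen discloses producing an image with LIIF (same field of endeavor or reasonably pertinent to the problem).     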
Chen discloses an LIIF module, and an image ensemble module of the second network (e.g. the paper describes LIIF operating in a patch environment.  The MLP layers are used in the decoding function of the LIIF.  A local ensemble is described on pages 3 and 4 to predict an overall signal.  In section 3 in the Local Implicit Image Function section on pages 3 and 4, the LIIF function is described.  The patch-based method with the MLP layers is described on pages 5, 6 and 8.  These aspects incorporated in the second network of the primary reference perform the features of the claims.).
Therefore, in view of Chen, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of an LIIF module, and an image ensemble module of the second network, incorporated in the device of Gao, as modified by Cui and Lee, in order to utilize LIIF to generate a higher resolution image, which improves the feature of providing learning tasks with size-varied image ground truths (as stated in Chen Abstract and page 2).   
However, the combination above fails to specifically teach the features of adding the up-sampled first residual image to the second image to generate the third image.
However, this is well known in the art as evidenced by Liu.  Similar to the primary reference, Liu discloses back projection network for image super resolution (same field of endeavor or reasonably pertinent to the problem).     
Liu discloses adding the up-sampled first residual image to the second image to generate the third image (e.g. the system discloses adding an up-sampled residual image to the super resolution image to generate an updated super resolution image, which is taught on page 3 and illustrated in figure 2.).
Therefore, in view of Liu, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of adding the up-sampled first residual image to the second image to generate the third image, incorporated in the device of Gao, as modified by Lee and Chen, in order to perform a back projection approach to super resolution, which can minimize the super resolution residual information and allow for extraction of more compact features for reconstruction (as stated in Liu page 3).   
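
For illustration only, the recited final step (adding an up-sampled residual to the second image to obtain the third image) can be sketched as below. This is a sketch under stated assumptions: nearest-neighbour repetition stands in for the learned up-sampling that the claim places in the model selection, LIIF, and image ensemble modules, and the function names are hypothetical.

```python
import numpy as np

def upsample_nearest(image: np.ndarray, scale: int) -> np.ndarray:
    """Up-sample a 2-D image by repeating each pixel scale times along both axes."""
    return np.repeat(np.repeat(image, scale, axis=0), scale, axis=1)

def refine_sr(sr_image: np.ndarray, lr_residual: np.ndarray, scale: int) -> np.ndarray:
    """Add the up-sampled low-resolution residual to the current SR estimate."""
    return sr_image + upsample_nearest(lr_residual, scale)

# A 2x2 residual corrects a 4x4 super-resolution estimate block by block.
sr_estimate = np.zeros((4, 4))
lr_residual = np.array([[1.0, 2.0], [3.0, 4.0]])
refined = refine_sr(sr_estimate, lr_residual, 2)
```

Each residual value is spread over the 2x2 block of the estimate that it corresponds to, so the refined image has the same size as the second image.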

Re claim 19: However, Gao fails to specifically teach the features of the electronic device according to claim 18, wherein generating the third image based on the first residual image and the second image comprises: up-sampling the first residual image in a model selection module, an LIIF module, and an image ensemble module of the second network; and adding the up-sampled first residual image to the second image to generate the third image.

However, an aspect of this is well known in the art as evidenced by Lee.  Similar to the primary reference, Lee discloses selecting models based on the type of images (same field of endeavor or reasonably pertinent to the problem).     
Lee discloses wherein generating the third image based on the first residual image and the second image comprises: up-sampling the first residual image in a model selection module of the second network (e.g. the system discloses down-sampling a training image, which can serve as a low-resolution image output from the training database generator.  This is taught in ¶ [163] and [211] above.  Later in the process, the down sampled image can be upscaled to convert the low-resolution image to a high-resolution image, which is taught in ¶ [49] and [50] above.).

Therefore, in view of Lee, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of wherein generating the third image based on the first residual image and the second image comprises: up-sampling the first residual image in a model selection module of the second network, incorporated in the device of Gao, as modified by Cui, in order to select a model based on the type of image to acquire a feature map, which can improve the quality of images in a still or video (as stated in Lee ¶ [06]-[10]).   
However, the combination above fails to specifically teach the features of an LIIF module, and an image ensemble module of the second network.  
However, this is well known in the art as evidenced by Chen.  Similar to the primary reference, Chen discloses producing an image with LIIF (same field of endeavor or reasonably pertinent to the problem).     
Chen discloses an LIIF module, and an image ensemble module of the second network (e.g. the paper describes LIIF operating in a patch environment.  The MLP layers are used in the decoding function of the LIIF.  A local ensemble is described on pages 3 and 4 to predict an overall signal.  In section 3 in the Local Implicit Image Function section on pages 3 and 4, the LIIF function is described.  The patch-based method with the MLP layers is described on pages 5, 6 and 8.  These aspects incorporated in the second network of the primary reference perform the features of the claims.).
Therefore, in view of Chen, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of an LIIF module, and an image ensemble module of the second network, incorporated in the device of Gao, as modified by Cui and Lee, in order to utilize LIIF to generate a higher resolution image, which improves the feature of providing learning tasks with size-varied image ground truths (as stated in Chen Abstract and page 2).   
However, the combination above fails to specifically teach the features of adding the up-sampled first residual image to the second image to generate the third image.
However, this is well known in the art as evidenced by Liu.  Similar to the primary reference, Liu discloses back projection network for image super resolution (same field of endeavor or reasonably pertinent to the problem).     
Liu discloses adding the up-sampled first residual image to the second image to generate the third image (e.g. the system discloses adding an up-sampled residual image to the super resolution image to generate an updated super resolution image, which is taught on page 3 and illustrated in figure 2.).
Therefore, in view of Liu, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention was made to have the feature of adding the up-sampled first residual image to the second image to generate the third image, incorporated in the device of Gao, as modified by Lee and Chen, in order to perform a back projection approach to super resolution, which can minimize the super resolution residual information and allow for extraction of more compact features for reconstruction (as stated in Liu page 3).   


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Haris et al. disclose Back Projection Networks for Super-resolution images.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHAD S DICKERSON whose telephone number is (571)270-1351. The examiner can normally be reached Monday-Friday 10AM-6PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abderrahim Merouan, can be reached at 571-270-5254. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/CHAD DICKERSON/           Primary Examiner, Art Unit 2683
Prosecution Timeline

Feb 05, 2024: Application Filed
May 14, 2026: Non-Final Rejection mailed (§101, §102, §103) (current)
Precedent Cases

Applications granted by this same examiner with similar technology (the 5 most recent grants):

18/177,878 (Patent 12641187): IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM. 3y 2m to grant; granted May 26, 2026.
18/355,594 (Patent 12639850): OBJECT POSE ESTIMATION IN THE CONTEXT OF NEURAL NETWORKS. 2y 10m to grant; granted May 26, 2026.
17/939,170 (Patent 12620085): IMAGING SYSTEM AND METHOD TO ESTIMATE CONTOUR OF A SCANNED OBJECT. 3y 8m to grant; granted May 05, 2026.
17/683,166 (Patent 12612046): VEHICLE DRIVABLE AREA DETECTION SYSTEM. 4y 2m to grant; granted Apr 28, 2026.
18/186,809 (Patent 12614257): SYSTEMS AND METHODS FOR CLUTTER ARTIFACT REMOVAL USING A DATA-DRIVEN MODEL TRAINED ON IMAGES FORMED USING A LOWER CLUTTER IMAGE SEQUENCE AND A HIGHER CLUTTER IMAGE SEQUENCE GENERATED BY AN ARTIFACT OVERLAY ON THE LOWER CLUTTER IMAGE SEQUENCE. 3y 1m to grant; granted Apr 28, 2026.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 63%
With Interview: 86% (+23.0%)
Median Time to Grant: 3y 2m (~10m remaining)
PTA Risk: Low
Based on 600 resolved cases by this examiner. Grant probability derived from career allowance rate.