Prosecution Insights
Last updated: April 19, 2026

Application No. 18/364,156
EFFICIENT IMAGE-DATA PROCESSING

Status: Non-Final Office Action (§103), OA Round 2
Filed: Aug 02, 2023
Examiner: SORRIN, AARON JOSEPH
Art Unit: 2672 (Tech Center 2600 — Communications)
Assignee: Qualcomm Incorporated

Predictions: 74% grant probability (favorable); 2-3 expected OA rounds; 3y 5m to grant; 99% with interview.

Examiner Intelligence

Career allow rate: 74% (46 granted / 62 resolved), +12.2% vs Tech Center average (above average)
Interview lift: +50.6% for resolved cases with interview
Typical timeline: 3y 5m average prosecution; 22 applications currently pending
Career history: 84 total applications across all art units

Statute-Specific Performance

§101: 20.4% (-19.6% vs TC avg)
§103: 35.6% (-4.4% vs TC avg)
§102: 14.1% (-25.9% vs TC avg)
§112: 29.3% (-10.7% vs TC avg)

Based on career data from 62 resolved cases; Tech Center averages are estimates.

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Objections are withdrawn. Claim interpretations under 35 USC 112(f) are maintained.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 9, 12-20, 23, and 26-30 are rejected under 35 U.S.C. 103 as being unpatentable over Fu (US20230036222A1) in view of Chen (US20220198607A1).
Regarding claim 1, Fu teaches “An apparatus for processing data, the apparatus comprising: at least one memory; and at least one processor coupled to the at least one memory” (Fu, Paragraphs 17 and 42, “It will be appreciated that embodiments of the disclosure described herein may be comprised of one or more conventional processors and unique stored program instructions that control the one or more processors to implement, in conjunction with certain non-processor circuits, some, most, or all of the functions of generating a high resolution, enhanced image from a high resolution, low light image by downsampling the high resolution, low light image to obtain a low resolution, low light image that is input into a low light enhancement model of a deep neural network component while the high resolution, low light image, the low resolution, low light image, and an output of the low light enhancement model of the deep neural network component are all input into a mathematical joint upsampling model component as described herein.”; “The application processor and the auxiliary processor(s) can be operable with the various components of the electronic device 100. Each of the application processor and the auxiliary processor(s) can be configured to process and execute executable software code to perform the various functions of the electronic device 100. A storage device, such as memory 113, can optionally store the executable software code used by the one or more processors 112 during operation.”) “and configured to: obtain first image data having a first resolution; downsample the first image data to generate downsampled first image data, wherein the downsampled first image data has a second resolution that is lower than the first resolution; process the downsampled first image data to generate processed downsampled first image data;” (Fu, Figure 4 element 401-403 and Paragraphs 84-86, “The method performed by the system 300 of FIG. 3 is summarized in FIG. 4 . 
Turning now to FIG. 4 , at step 401 an imager captures a high resolution, low light image. In one or more embodiments, the high resolution, low light image captured at step 401 has associated therewith a light level of less than ten lux, although the method 400 of FIG. 4 works for low light images having greater light levels as well. In one or more embodiments, the high resolution, low light image has a spatial resolution of 1440×1080 pixels, although the high resolution, low light image can have other spatial resolutions as well. At step 402, the high resolution, low light image is downsampled to obtain a low resolution, low light image. In one or more embodiments, the result of this downsampling is that the high resolution, low light image has a spatial resolution of between three and five times the spatial resolution of the low resolution, low light image. At step 403, the low resolution, low light image is processed by a low light enhancement model of a deep neural network to obtain a low resolution, enhanced image. In one or more embodiments, the low light enhancement model of the deep neural network is trained through deep learning of semantic enhancement for images photographed in a low light environment using a supervised learning process as previously described. In one or more embodiments, the low resolution, enhanced image includes richer semantics, examples of which include true colors, edges, and brightness level, than does the low resolution, low light image. 
In one or more embodiments, the low resolution, low light image and the low resolution, enhanced image have the same spatial resolution.”) While Fu generates “upsampled image data based on the processed downsampled first image data and the first image data” (Fu, Figure 4 elements 404-406, and Paragraphs 87 and 90, “At step 404, the following inputs are delivered to a mathematical model, which in one embodiment is sharpness-preserving mathematical model: the high resolution, low light image, the low resolution, low light image, and the low resolution, enhanced image.” “At step 406, the sharpness-preserving mathematical model performs an upsampling process using its inputs by resizing the estimated α matrix and the estimated β matrix to generate a scaled α matrix and a scaled β matrix each having a spatial resolution equal to that of the high resolution, low light image. Using these resized matrices, step 406 comprises generating the high resolution, enhanced image using the following equation for each pixel d(i,j) of the high resolution, enhanced image: d(i,j)=α_scaled(i,j)*a(i,j)+β_scaled(i,j) (EQ. 2)”), Fu does not expressly disclose upsampling using a machine-learning model.
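For readers less familiar with joint upsampling, Fu's EQ. 2 amounts to a per-pixel affine transform: the α and β matrices estimated at low resolution are resized to the full resolution and applied to the original high-resolution pixels. A minimal NumPy sketch of that step follows; the function and array names are hypothetical, and nearest-neighbor resizing is an assumption (Fu does not specify the resizing kernel).

```python
import numpy as np

def joint_upsample(high_res_low_light, alpha_low, beta_low):
    """Sketch of Fu's EQ. 2: d(i,j) = alpha_scaled(i,j) * a(i,j) + beta_scaled(i,j).

    alpha_low / beta_low are the transformational matrices estimated at the
    low resolution; they are resized to the high resolution and applied as a
    per-pixel affine transform to the original high-res, low-light pixels.
    """
    H, W = high_res_low_light.shape
    h, w = alpha_low.shape
    # Nearest-neighbor resize of the low-res matrices to the high resolution
    # (assumed here; any interpolation scheme could play this role).
    rows = np.arange(H) * h // H
    cols = np.arange(W) * w // W
    alpha_scaled = alpha_low[np.ix_(rows, cols)]
    beta_scaled = beta_low[np.ix_(rows, cols)]
    # EQ. 2, applied element-wise over all pixels at once.
    return alpha_scaled * high_res_low_light + beta_scaled
```

The key point for the rejection is that this upsampling step is a fixed mathematical operation, not a learned model, which is the gap the examiner fills with Chen.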
Chen discloses upsampling using a machine learning model based on an original and a downsampled image (Chen, Paragraph 150, “In some embodiments, a computer-implemented method for training a neural network to downsample images in a video encoding pipeline comprises executing a first convolutional neural network on a first source image having a first resolution to generate a first downsampled image, wherein the first convolutional neural network includes at least two residual blocks and is associated with a first downsampling factor, executing an upsampling algorithm on the first downsampled image to generate a first reconstructed image having the first resolution, computing a first reconstruction error based on the first reconstructed image and the first source image, and updating at least one parameter of the first convolutional neural network based on the first reconstruction error to generate a trained convolutional neural network.”)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to use the CNN of Chen for performing the upsampling of Fu. The motivation for doing so would have been to provide a means of efficiently performing the upscaling and handling complex and vast amounts of image data. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
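The training procedure Chen recites (downsample with a learnable model, upsample back, compute a reconstruction error, update the parameters) can be sketched as a loop. This toy version substitutes a single learnable gain for Chen's residual-block CNN; the structure of the loop, not the model, is the point, and all names here are illustrative, not from Chen.

```python
import numpy as np

def train_downsampler(source, factor=2, lr=0.1, steps=200):
    """Toy sketch of Chen's cited training procedure: run a learnable
    downsampler, upsample the result back to the source resolution, compute a
    reconstruction error against the source, and update the parameters.
    """
    gain = 0.0  # the single "network parameter" standing in for CNN weights
    H, W = source.shape
    for _ in range(steps):
        # "Downsample": average-pool by `factor`, scaled by the learnable gain.
        pooled = source.reshape(H // factor, factor, W // factor, factor).mean(axis=(1, 3))
        down = gain * pooled
        # Upsample back to the source resolution (nearest-neighbor).
        recon = np.repeat(np.repeat(down, factor, axis=0), factor, axis=1)
        # Reconstruction error and a gradient step on the parameter.
        err = recon - source
        grad = 2.0 * np.mean(err * np.repeat(np.repeat(pooled, factor, 0), factor, 1))
        gain -= lr * grad
    return gain
```

On a constant image the gain converges toward 1.0, i.e. the downsampler learns to preserve the content that the upsampler needs to reconstruct the source.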
Therefore, it would have been obvious to combine Fu with the above teaching of Chen to fully disclose, “generate, using a machine-learning model, upsampled image data based on the processed downsampled first image data and the first image data.” Fu in view of Chen further disclose, “obtain second image data;” (Fu, Paragraph 14, “Before describing in detail embodiments that are in accordance with the present disclosure, it should be observed that the embodiments reside primarily in combinations of method steps and apparatus components related to processing high resolution, low light images with a hybrid process combining a low light enhancement model of a deep neural network trained through deep learning of semantic enhancement for images photographed in a low light environment with a sharpness-preserving mathematical model to optimize a maximum semantic recovery from the high resolution, low light images while using a minimum processor loading within a predefined low light enhancement process duration time. Any process descriptions or blocks in flow charts should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process.” Note that “images” teaches the plurality of images (including a second image).) “and generate processed second image data based on the upsampled image data and the second image data.” (Fu, Paragraph 51, “In one or more embodiments, the low light enhancement model of the deep neural network component 109 is trained using supervised learning in which the machine learning infers functions from a control and a conditioned input. For instance, ground truth images can be input as controls while identical, but low light images, are input as conditioned inputs. 
The low light enhancement model of the deep neural network component 109 may perform regressions and other processes on the inputs to predict output classifications or other functions to predict outputs. While supervised learning is a preferred method for training the low light enhancement model of the deep neural network component 109 in one or more embodiments, other techniques for training could be used as well, including unsupervised learning, semi-supervised learning, and reinforcement learning. Other training techniques will be obvious to those of ordinary skill in the art having the benefit of this disclosure.” Note that reinforcement learning uses previous data (the upsampled image data) to process the new data (second image data) to generate the processed second image data (as the second image data is run through the process of Fu).)

Regarding claim 2, Fu in view of Chen teach “The apparatus of claim 1,” “wherein, to process the downsampled image data, the at least one processor is configured to at least one of: apply one or more filters to the downsampled first image data; reduce noise in the downsampled first image data; sharpen the downsampled first image data; de-mosaic the downsampled first image data; or re-mosaic the downsampled first image data.” (Fu, Paragraph 70, “The low light enhancement model of the deep neural network component 109 then processes the low resolution, low light image 302 to recover missing semantic details from the low resolution, low light image 302. Examples of such semantic details include true colors of the low resolution, low light image 302, edges of the low resolution, low light image 302, brightness levels of the low resolution, low light image 302, noise reduction in the low resolution, low light image 302, contrast in the low resolution, low light image 302, and so forth.
Other examples of semantic information suitable for recovery using a low light enhancement model of the deep neural network component 109 will be obvious to those of ordinary skill in the art having the benefit of this disclosure. As a result of this processing, the low light enhancement model of the deep neural network component 109 generates a low resolution, enhanced image 303 with enhanced semantic details. In one or more embodiments, the spatial resolution of the low resolution, low light image 302 and the low resolution, enhanced image 303 are the same.” Note that the underlined limitations are the processes that are disclosed by Fu.)

Regarding claim 3, Fu in view of Chen teach “The apparatus of claim 1,” “wherein the machine-learning model is trained to receive the first image data having the first resolution and the second image data having the second resolution as inputs and to generate third image data having the first resolution.” (Chen, Paragraph 250, “In some embodiments, a computer-implemented method for training a neural network to downsample images in a video encoding pipeline comprises executing a first convolutional neural network on a first source image having a first resolution to generate a first downsampled image, wherein the first convolutional neural network includes at least two residual blocks and is associated with a first downsampling factor, executing an upsampling algorithm on the first downsampled image to generate a first reconstructed image having the first resolution, computing a first reconstruction error based on the first reconstructed image and the first source image, and updating at least one parameter of the first convolutional neural network based on the first reconstruction error to generate a trained convolutional neural network.” Note that the above teaching was incorporated with motivation and rationale in the rejection of claim 1.
Fu further supports that the output image matches the resolution of the initial image in Paragraph 71, “In so doing, this results in the high resolution, low light image 204 and the high resolution, enhanced image 305 having the same spatial resolution. If the high resolution, low light image 204 is 1440×1080 pixels, the high resolution, enhanced image 305 will have the same spatial resolutions, and so forth.”)

Regarding claim 4, Fu in view of Chen teach “The apparatus of claim 1,” “wherein the machine-learning model comprises a neural network comprising two or more convolutional layers.” (Chen, Paragraph 250, “In some embodiments, a computer-implemented method for training a neural network to downsample images in a video encoding pipeline comprises executing a first convolutional neural network on a first source image having a first resolution to generate a first downsampled image, wherein the first convolutional neural network includes at least two residual blocks and is associated with a first downsampling factor, executing an upsampling algorithm on the first downsampled image to generate a first reconstructed image having the first resolution, computing a first reconstruction error based on the first reconstructed image and the first source image, and updating at least one parameter of the first convolutional neural network based on the first reconstruction error to generate a trained convolutional neural network.” Note that the above teaching was incorporated with motivation and rationale in the rejection of claim 1, and residual blocks inherently contain convolutional layers.)

Regarding claim 5, Fu in view of Chen teach “The apparatus of claim 1,” “wherein the upsampled first image data has the first resolution.” (Fu, Paragraph 71, “A mathematical model 304, one example of which is the mathematical joint upsampling model component (110) of FIG.
1, then generates a high resolution, enhanced image 305 from three separate inputs: the high resolution, low light image 204, the low resolution, low light image 302, and the low resolution, enhanced image 303. In one or more embodiments, the mathematical model 304 comprises a sharpness-preserving mathematical model that performs super-resolution tasks on the high resolution, low light image 204 using one or more transformational matrices calculated by the mathematical model 304 from the low resolution, low light image 302 and the low resolution, enhanced image 303. In one or more embodiments, this includes performing an upsampling process using the inputs when generating the high resolution, enhanced image 305, where the mathematical model 304 resizes these transformational matrices from those having spatial resolutions corresponding to the low resolution, low light image 302 and low resolution, enhanced image 303 to those having other spatial resolutions corresponding to the high resolution, low light image 204. In so doing, this results in the high resolution, low light image 204 and the high resolution, enhanced image 305 having the same spatial resolution. If the high resolution, low light image 204 is 1440×1080 pixels, the high resolution, enhanced image 305 will have the same spatial resolutions, and so forth.”)

Regarding claim 6, Fu in view of Chen teach “The apparatus of claim 1,” “wherein the first image data comprises raw image data captured by an image sensor of a device and wherein the downsampled first image data is processed by an image signal processor (ISP) of the device.” (Fu, Figure 1 elements 121 and 106 show the image sensors of the device for capturing raw image data, wherein the device includes the ISP (element 111). Paragraph 48 further explains, “The one or more processors 112 can employ the modules 116 or other instructional code to be operable with the imager 106, the second imager 121, and/or the image processor 111 as well.
Illustrating by example, as will be explained below the one or more processors 112, the image processor 111, or combinations thereof can be instructed to generate a high resolution, enhanced image from a high resolution, low light image using a hybrid image enhancement technique by combining a low light enhancement model of a deep neural network component 109 trained through deep learning of semantic enhancement for images captured or photographed in low light environments with a mathematical joint upsampling model component 110. For instance, a downsampler 104 can downsample a high resolution, low light image to obtain a low resolution, low light image. The one or more processors 112 and/or image processor 111 can input the low resolution, low light image into the low light enhancement model of the deep neural network component 109 to generate a low resolution, enhanced image. The one or more processors 112 and/or image processor 111 can then input each of the high resolution, low light image, the low resolution, low light image, and the low resolution, enhanced image into the mathematical joint upsampling model component for processing.”)

Regarding claim 9, Fu in view of Chen teach “The apparatus of claim 1,” “wherein the at least one processor is further configured to at least one of display, store, or transmit the upsampled image data.” (Fu, Paragraph 60, “For instance, when the downsampler 104 downsamples the high resolution, low light image by three to five times to generate the low resolution, low light image, the mathematical joint upsampling model component 110 will calculate an estimated two two-dimensional transformational matrices α and β that have the same spatial resolution as do the low resolution, low light image and, when generated by the low light enhancement model of the deep neural network component 109, the low resolution, enhanced image.
The mathematical joint upsampling model component 110 then resizes the α matrix and β matrix to have the same spatial resolution as does the high resolution, low light image. The one or more processors 112 and/or image processor 111 then use the mathematical joint upsampling model component 110 to apply the α matrix and the β matrix after scaling to original pixels of the high resolution, low light image to generate the high resolution, enhanced image. In one or more embodiments, the one or more processors 112 and/or image processor 111 then present the high resolution, enhanced image on a display to a user, examples of which include the display 105 and/or the second display 120.”)

Regarding claim 12, Fu in view of Chen teach “The apparatus of claim 1,” “wherein: the first image data comprises a first frame of video data; and the second image data comprises a second frame of the video data.” (Fu, Paragraph 67, “As shown in FIG. 2 , the user 200 is causing the imager (106) to capture the one or more images 201 by touching a user actuation target presented on the second display 120 of the electronic device 100. While this is one mechanism by which the imager (106) will capture the light signals in its field of view to capture the one or more 202 by converting those light signals into operable data, it should be noted that the imager (106) may capture images in other ways as well. Illustrating by example, when the imager (106) is operating in a preview mode as a viewfinder for the user 200, the imager (106) may capture consecutive images at a rate of 15 Hertz, 30 Hertz, 60 Hertz, or another rate for presentation on a display, e.g., the second display 120, of the electronic device 100. Alternatively, the imager (106) may capture successive images when recording video as well.
Accordingly “capture” of an image as used herein means a conversion of light signals to operable data suitable for storage in a storage device of the electronic device 100, such as memory (113).”)

Regarding claim 13, Fu in view of Chen teach “The apparatus of claim 1,” “wherein the at least one processor is further configured to: obtain third image data; and process the third image data to generate processed third image data;” (Fu, Paragraph 14, “Before describing in detail embodiments that are in accordance with the present disclosure, it should be observed that the embodiments reside primarily in combinations of method steps and apparatus components related to processing high resolution, low light images with a hybrid process combining a low light enhancement model of a deep neural network trained through deep learning of semantic enhancement for images photographed in a low light environment with a sharpness-preserving mathematical model to optimize a maximum semantic recovery from the high resolution, low light images while using a minimum processor loading within a predefined low light enhancement process duration time. Any process descriptions or blocks in flow charts should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process.” Note that “images” teaches that a plurality of images, including a third image, are obtained, and the plurality of images are processed, generating a processed third image data.) “wherein the processed second image data is generated further based on the processed third image data.” (Fu, Paragraph 51 (see rejection of claim 1) describes reinforcement learning, wherein previously collected and processed images are used in training for processing of subsequently collected images.
In a case where the third image data is processed prior to the processing of the second image data, the conditions of this limitation are met. The processed third image data is used during reinforcement training of the model that is then used for processing second image data.)

Regarding claim 14, Fu in view of Chen teach “The apparatus of claim 1,” The apparatus recited in claim 14 is analogous to the apparatus recited in claim 1, but for the recitation of a second image data having the first resolution and a downsampled second image data having the second resolution. (Fu teaches that the apparatus is used for a plurality of images undergoing identical processing: Paragraph 67, “As shown in FIG. 2 , the user 200 is causing the imager (106) to capture the one or more images 201 by touching a user actuation target presented on the second display 120 of the electronic device 100. While this is one mechanism by which the imager (106) will capture the light signals in its field of view to capture the one or more 202 by converting those light signals into operable data, it should be noted that the imager (106) may capture images in other ways as well. Illustrating by example, when the imager (106) is operating in a preview mode as a viewfinder for the user 200, the imager (106) may capture consecutive images at a rate of 15 Hertz, 30 Hertz, 60 Hertz, or another rate for presentation on a display, e.g., the second display 120, of the electronic device 100. Alternatively, the imager (106) may capture successive images when recording video as well. Accordingly “capture” of an image as used herein means a conversion of light signals to operable data suitable for storage in a storage device of the electronic device 100, such as memory (113).”) Accordingly, Fu in view of Chen fully disclose the invention of claim 14.

Regarding claims 15-20, 23, and 26-28, these claims recite a method with steps corresponding to the elements of the system recited in Claims 1-6, 9, and 12-14.
Therefore, the recited steps of these claims are mapped to the analogous elements in the corresponding system claims. Additionally, the rationale and motivation to combine the Fu and Chen references apply to these claims.

Regarding claim 29, claim 29 recites a non-transitory computer-readable storage medium having stored thereon instructions corresponding to the elements of the system recited in claim 1. Therefore, the recited programming instructions of this claim are mapped to the analogous elements in the corresponding system claim. Additionally, the rationale and motivation to combine the Fu and Chen references apply to this claim. The combination of Fu and Chen disclose a non-transitory computer-readable storage medium having stored thereon instructions (Fu, Figure 1 (element 113) and Paragraph 42, “The application processor and the auxiliary processor(s) can be operable with the various components of the electronic device 100. Each of the application processor and the auxiliary processor(s) can be configured to process and execute executable software code to perform the various functions of the electronic device 100. A storage device, such as memory 113, can optionally store the executable software code used by the one or more processors 112 during operation.”)

Regarding claim 30, claim 30 recites an apparatus with elements corresponding to the elements of the system recited in claim 1. Therefore, the recited elements of this claim are mapped to the analogous elements in the corresponding system claim. Additionally, the rationale and motivation to combine the Fu and Chen references apply to this claim.

Claims 7, 8, 21, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Chen further in view of Zhu (US 20220058774 A1).
Regarding claim 7, Fu in view of Chen teach “The apparatus of claim 1,” While Fu in view of Chen disclose providing image data and downsampled data as inputs to the machine-learning model (see combination statement in the rejection of claim 1), they do not expressly disclose a step of rearranging the image data and providing the rearranged image data as inputs to the machine learning model. Zhu teaches rearranging image data and using the rearranged image data as part of neural network processing (Zhu, Figure 7 and Paragraph 44, “Processes may be implemented on computing platforms such as those discussed further above with respect to FIGS. 1 and 2 to perform image enhancement using s2d and d2s operations in accordance with embodiments of the invention. For example, memory on a computing device can include an image enhancement application and parameters of a neural network. A processor or processing system on the computing device can include a hardware accelerator capable of implementing the neural network with a spatial resolution (e.g., height and width) and number of channels. The processor or processing system can be configured by the image enhancement application to implement the neural network and perform processes for image enhancement. A process in accordance with embodiments of the invention is illustrated in FIG. 7. The process 700 includes receiving an image and providing (710) at least a portion of the input image to an input layer of the neural network, where the input layer has initial spatial dimensions and an initial number of channels.”)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to perform the rearranging of Zhu on the image data of Fu in view of Chen for inputting into the machine learning model.
The motivation for doing so would have been to improve computational speed, as described in Paragraph 40 of Zhu, “Typically, the purpose in utilizing s2d is to perform lossless downsampling to reduce the spatial extent of NN layers without losing spatial information. In a number of embodiments of the invention, however, the use of the s2d operation serves to increase the depth/channel processing performed by the NN hardware acceleration to fully utilize the channel counts optimally supported by the hardware acceleration platform without incurring computational latency due to channel-wise parallel processing. In many embodiments, the s2d operation also provides the additional benefit of spatial extent reduction which further improves inference computation speed as the convolutional kernels are required to raster over fewer spatial pixels, ultimately enabling processing of more images for a given time duration (e.g. frames per second in a video sequence) or larger numbers of pixels for each image.” Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
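The space-to-depth (s2d) rearrangement Zhu relies on can be illustrated concretely: each block×block spatial neighborhood is relocated into channels, shrinking the spatial extent without discarding any pixel. A minimal single-channel NumPy sketch (the function name and the H×W input shape are illustrative assumptions, not from Zhu):

```python
import numpy as np

def space_to_depth(img, block=2):
    """Sketch of the s2d rearrangement: each block x block spatial
    neighborhood becomes block**2 channels, reducing spatial extent
    losslessly (every pixel is kept, just relocated).
    """
    H, W = img.shape
    # Split H and W into (blocks, intra-block) pairs, then move the two
    # intra-block axes to a trailing channel dimension.
    x = img.reshape(H // block, block, W // block, block)
    return x.transpose(0, 2, 1, 3).reshape(H // block, W // block, block * block)
```

A 4×4 input becomes a 2×2×4 tensor whose four channels hold the four pixels of each 2×2 neighborhood, which is why the operation is described as lossless downsampling.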
Therefore, it would have been obvious to combine Fu in view of Chen with the above teaching of Zhu to fully disclose, “wherein the at least one processor is further configured to: rearrange the first image data to generate rearranged first image data; and provide the rearranged first image data and the downsampled first image data as inputs to the machine-learning model.”

Regarding claim 8, Fu in view of Chen further in view of Zhu teach “The apparatus of claim 7,” “wherein, to rearrange the first image data, the at least one processor is configured to rearrange the first image data as a tensor with dimensions related to dimensions of the downsampled first image data.” (Zhu, Figure 7 and Paragraph 44, “Processes may be implemented on computing platforms such as those discussed further above with respect to FIGS. 1 and 2 to perform image enhancement using s2d and d2s operations in accordance with embodiments of the invention. For example, memory on a computing device can include an image enhancement application and parameters of a neural network. A processor or processing system on the computing device can include a hardware accelerator capable of implementing the neural network with a spatial resolution (e.g., height and width) and number of channels. The processor or processing system can be configured by the image enhancement application to implement the neural network and perform processes for image enhancement. A process in accordance with embodiments of the invention is illustrated in FIG. 7. The process 700 includes receiving an image and providing (710) at least a portion of the input image to an input layer of the neural network, where the input layer has initial spatial dimensions and an initial number of channels.” Note that s2d and d2s operations amount to tensor manipulation. Additionally, the dimensions are inherently ‘related to’ the dimensions of the downsampled image, as the downsampled image is generated from the first image data.
The above teaching of Zhu was incorporated with rationale and motivation in the rejection of claim 7.)

Regarding claims 21 and 22, these claims recite a method with steps corresponding to the elements of the system recited in claims 7 and 8. Therefore, the recited steps of these claims are mapped to the analogous elements in the corresponding system claims. Additionally, the rationale and motivation to combine the Fu, Chen, and Zhu references apply to these claims.

Claim(s) 10 and 24 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Chen further in view of Powell (US 20230027452 A1).

Regarding claim 10, Fu in view of Chen teach “The apparatus of claim 1.” While Fu in view of Chen disclose downsampling (see rejection of claim 1), they do not expressly disclose that downsampling is performed via binning. Powell discloses performing downsampling via binning (Powell, Paragraph 5, “Embodiments provide novel technical approaches for efficient high-resolution output of an image captured using a high-pixel-count image sensor based on pixel binning followed by luminance-guided upsampling. For example, an image sensor array is configured according to a red-green-blue-luminance (RGBL) CFA pattern, such that at least 50-percent of the imaging pixels of the array are luminance (L) pixels. Pixel binning is used during readout of the array to concurrently generate a downsampled RGB capture frame and a downsampled L capture frame. Following the readout, the L capture frame is upsampled (e.g., by upscaling and interpolation) to generate an L guide frame with 100-percent luminance density.
An upsampled RGB frame can then be generated by interpolating the RGB capture frame based both on known neighboring RGB information (e.g., from the RGB capture frame and previously interpolated information), as adjusted based on local luminance information from the L guide frame.”)

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to use the binning of Powell for performing the downsampling of Fu in view of Chen. The motivation for doing so would have been to reduce noise and mitigate outliers. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Fu in view of Chen with the above teaching of Powell to fully disclose, “wherein, to downsample the first image data, the at least one processor is configured to bin the first image data.”

Regarding claim 24, this claim recites a method with steps corresponding to the elements of the system recited in claim 10. Therefore, the recited steps of this claim are mapped to the analogous elements in the corresponding system claim. Additionally, the rationale and motivation to combine the Fu, Chen, and Powell references apply to this claim.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AARON JOSEPH SORRIN, whose telephone number is (703) 756-1565. The examiner can normally be reached Monday - Friday, 9am - 5pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AARON JOSEPH SORRIN/
Examiner, Art Unit 2672

/SUMATI LEFKOWITZ/
Supervisory Patent Examiner, Art Unit 2672
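The pixel binning Powell relies on in the claim 10 rejection can be sketched as simple 2×2 block averaging. This is a hedged software illustration only; Powell's RGBL binning happens during sensor readout, and the function name and block size here are illustrative assumptions.

```python
import numpy as np

def bin2x2(x):
    """Downsample a 2D frame by averaging non-overlapping 2x2 pixel blocks.

    Averaging within each block suppresses per-pixel read noise and dampens
    single-pixel outliers, matching the stated motivation to combine.
    """
    h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0, "frame dims must be even for 2x2 binning"
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

frame = np.array([[10., 12., 50., 54.],
                  [14., 16., 52., 56.],
                  [ 0.,  2., 30., 34.],
                  [ 4.,  6., 32., 36.]])
binned = bin2x2(frame)
# each output pixel is the mean of one 2x2 block: [[13, 53], [3, 33]]
print(binned)
```

Note how the 2×2 average leaves a frame with half the spatial extent in each dimension, i.e., the downsampling the claim recites, while each output value blends four noisy samples.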

Prosecution Timeline

Aug 02, 2023
Application Filed
Sep 25, 2025
Non-Final Rejection — §103
Dec 11, 2025
Response Filed
Feb 17, 2026
Non-Final Rejection — §103
Mar 31, 2026
Interview Requested
Apr 16, 2026
Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592054
LOW-LIGHT VIDEO PROCESSING METHOD, DEVICE AND STORAGE MEDIUM
2y 5m to grant Granted Mar 31, 2026
Patent 12586245
ROBUST LIDAR-TO-CAMERA SENSOR ALIGNMENT
2y 5m to grant Granted Mar 24, 2026
Patent 12566954
SOLVING MULTIPLE TASKS SIMULTANEOUSLY USING CAPSULE NEURAL NETWORKS
2y 5m to grant Granted Mar 03, 2026
Patent 12555394
IMAGE PROCESSING APPARATUS, METHOD, AND STORAGE MEDIUM FOR GENERATING DATA BASED ON A CAPTURED IMAGE
2y 5m to grant Granted Feb 17, 2026
Patent 12547658
RETRIEVING DIGITAL IMAGES IN RESPONSE TO SEARCH QUERIES FOR SEARCH-DRIVEN IMAGE EDITING
2y 5m to grant Granted Feb 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

2-3
Expected OA Rounds
74%
Grant Probability
99%
With Interview (+50.6%)
3y 5m
Median Time to Grant
Moderate
PTA Risk
Based on 62 resolved cases by this examiner. Grant probability derived from career allow rate.
