Prosecution Insights
Last updated: April 19, 2026
Application No. 18/523,347

IMAGE SIGNAL PROCESSING OF STREAMED IMAGE DATA

Non-Final OA: §102, §103, §112
Filed: Nov 29, 2023
Examiner: ZAK, JACQUELINE ROSE
Art Unit: 2666
Tech Center: 2600 — Communications
Assignee: Arm Limited
OA Round: 1 (Non-Final)
Grant Probability: 67% (Favorable)
OA Rounds: 1-2
To Grant: 2y 10m
With Interview: 55%

Examiner Intelligence

Career Allow Rate: 67%, above average (8 granted / 12 resolved; +4.7% vs TC avg)
Interview Lift: -11.4%, minimal (resolved cases with vs. without interview)
Avg Prosecution: 2y 10m (typical timeline)
Total Applications: 58 across all art units (46 currently pending)

Statute-Specific Performance

§101: 5.7% (-34.3% vs TC avg)
§103: 56.3% (+16.3% vs TC avg)
§102: 21.1% (-18.9% vs TC avg)
§112: 13.8% (-26.2% vs TC avg)
Tech Center averages are estimates • Based on career data from 12 resolved cases
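The card figures above are simple ratios. A minimal sketch of the arithmetic, assuming only the counts and deltas stated on the cards (8 grants out of 12 resolved cases; the "vs TC avg" deltas are subtracted from the stated rates to recover the implied Tech Center baselines):

```python
# Reproduce the examiner-card arithmetic from the stated counts.
# TC baselines are derived from the stated deltas (estimates, per the footnote).

granted = 8
resolved = 12

career_allow_rate = granted / resolved        # 0.666... -> "67%"
tc_avg_estimate = career_allow_rate - 0.047   # implied TC average, ~62%

# Statute-specific rows are read the same way: stated rate minus the
# stated "vs TC avg" delta recovers the implied TC baseline.
sec_103_rate, sec_103_delta = 0.563, 0.163
sec_103_tc_baseline = sec_103_rate - sec_103_delta

print(f"career allow rate:        {career_allow_rate:.1%}")
print(f"implied TC average:       {tc_avg_estimate:.1%}")
print(f"implied §103 TC baseline: {sec_103_tc_baseline:.1%}")
```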

Office Action

Rejections: §102, §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status

Claims 1-15 are pending for examination in the application filed 11/29/2023.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 12 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claim 12 recites the limitation "wherein in a case that consecutive horizontal lines change length and/or image data alignment due to a change in horizontal scaling following a change in alignment with the one or more regions of interest, the first spatial processing controls the first processed stream to contain two different versions of the same line or to contain different versions of a portion of the same line in order to provide image data for the image signal processing". There is insufficient antecedent basis for "the same line" in the claim. It is unclear what the "horizontal lines" and "the same line" are. Please clarify.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 5, and 13-15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Kundu (US20250045873A1).

Regarding claim 1, Kundu teaches a method of processing streamed image data comprising: obtaining a stream of image data ([0005] Disclosed are systems, apparatuses, methods, and computer-readable media for performing foveated sensing. According to at least one example, a method is provided for generating one or more frames. The method includes: capturing, using an image sensor, sensor data for a frame associated with a scene); obtaining information identifying a location of one or more region of interest in the image data ([0005] determining a region of interest (ROI) associated with the scene); performing a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more region of interest to generate a first processed stream ([0005] generating a first portion of the frame corresponding to the ROI, the first portion having a first resolution; generating a second portion of the frame, the second portion having a second resolution that is lower than the first resolution); performing image signal processing on the first processed stream to generate a stream of processed image data ([0094] In some aspects, the post-processing engine 624 can process the salient portion of the frame and the peripheral portion of the frame to improve various aspects of the image data, such as color saturation, color balance, warping, and so forth. In some aspects, different parameters can be used for the salient and non-salient parts of the frame, resulting in different qualities for the different parts of the frame. For example, the front-end engine or the post-processing engine can perform sharpening on the salient portion of the frame to improve distinguishing edges); performing a second spatial processing on the stream of processed image data to generate a second processed stream of image data ([0096] In some aspects, the mask 616 can be provided to the post-processing engine 624 to improve image processing of the salient portion of the frame and the peripheral portion of the frame. After the salient portion of the frame and the peripheral portion of the frame are processed, the salient portion of the frame and the peripheral portion of the frame are provided to a blending engine 626 (e.g., a GPU) for blending the salient portion of the frame and the peripheral portion of the frame into a single output frame. [0103] The blending engine 722 may also be configured to perform various operations based on the mask. For example, a more sophisticated upscaling technique (e.g., bicubic) may be applied to the salient region, and a simpler upscaling technique (e.g., bilinear) may be applied to the peripheral region).

Regarding claim 2, Kundu teaches the method of claim 1. Kundu further teaches wherein performing second spatial processing generates a stream of image data corresponding to the one or more region of interest and a stream of data corresponding to an overall image, wherein the streams relating to the one or more region of interest and the overall image have different spatial resolutions ([0109] The post-processing engine 814 can read the salient region stream and the peripheral region stream in the memory 812 and process one or more of the streams…The post-processing engine 814 provides the processed frames to the blending engine 816 for blending the frames and other rendered content into a single frame, which is output to display panels of the XR system 800. The post-processing engine 814 also provides the processed frames to the ROI detection engine 808, which predicts a mask 806 for the next frame based on the processed frames and sensor information from various sensors. [0103] The blending engine 722 may also be configured to perform various operations based on the mask. For example, a more sophisticated upscaling technique (e.g., bicubic) may be applied to the salient region, and a simpler upscaling technique (e.g., bilinear) may be applied to the peripheral region. [0108] FIG. 8 illustrates an example of foveating a frame or image into salient portions and peripheral portions based on a mask 806 provided from an ROI detection engine 808 that detected the salient region (e.g., ROI) of a previous frame…The front-end engine 810 may downscale or downsample the peripheral region stream to conserve bandwidth).

Regarding claim 5, Kundu teaches the method of claim 1. Kundu further teaches wherein the first processed stream represents image data that is formed of a rectangular grid of pixel values ([0079] In some aspects, a mask (e.g., a binary or bitmap mask or image) can be used to indicate the ROI or salient region of a scene. For instance, a first value (e.g., a value of 1) for pixels in the mask can specify pixels within the ROI and a second value (e.g., a value of 0) for pixels in the mask can specify pixels in the peripheral region (outside of the ROI)).

Regarding claim 13, Kundu teaches the method of claim 1. Kundu further teaches wherein the image signal processing comprises one or more of: sharpening, noise reduction, contrast adjustment, colour correction, edge detection, focus detection, and colour space processing ([0094] In some aspects, the post-processing engine 624 can process the salient portion of the frame and the peripheral portion of the frame to improve various aspects of the image data, such as color saturation, color balance, warping, and so forth. In some aspects, different parameters can be used for the salient and non-salient parts of the frame, resulting in different qualities for the different parts of the frame. For example, the front-end engine or the post-processing engine can perform sharpening on the salient portion of the frame to improve distinguishing edges).
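The claim 1 mapping above describes a two-resolution pipeline: a full-resolution ROI stream plus a reduced-resolution peripheral stream that are later blended. A minimal sketch of that split, assuming a rectangular ROI and 2x2 average binning for the periphery (the helper names and ROI box layout are illustrative, not Kundu's):

```python
# Sketch of foveated stream splitting: a full-resolution ROI crop plus a
# 2x2-binned peripheral copy, per the Kundu mapping above. Grayscale frames
# are modeled as lists of rows; names are illustrative assumptions.

def bin2x2(frame):
    """Average each 2x2 block, quartering the pixel count."""
    h, w = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1] +
             frame[y + 1][x] + frame[y + 1][x + 1]) / 4
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]

def split_foveated(frame, roi):
    """roi = (top, left, height, width); returns (roi_stream, peripheral_stream)."""
    top, left, rh, rw = roi
    roi_stream = [row[left:left + rw] for row in frame[top:top + rh]]
    peripheral_stream = bin2x2(frame)  # whole frame at lower resolution
    return roi_stream, peripheral_stream

frame = [[x + 10 * y for x in range(8)] for y in range(8)]
roi_stream, periph = split_foveated(frame, (2, 2, 4, 4))
# ROI keeps full resolution (4x4 crop); periphery is 4x4 instead of 8x8.
assert len(roi_stream) == 4 and len(roi_stream[0]) == 4
assert len(periph) == 4 and len(periph[0]) == 4
```

Downstream, the two streams would be processed with different ISP parameters and recombined by a blending stage, which is the role the cited [0096]/[0103] passages assign to the blending engine.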
Regarding claim 14, Kundu teaches a device configured to process streamed image data comprising: one or more hardware units configured to: obtain a stream of image data ([0005] Disclosed are systems, apparatuses, methods, and computer-readable media for performing foveated sensing. According to at least one example, a method is provided for generating one or more frames. The method includes: capturing, using an image sensor, sensor data for a frame associated with a scene); obtain information identifying a location of one or more region of interest in the image data ([0005] determining a region of interest (ROI) associated with the scene); perform a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more region of interest to generate a first processed stream ([0005] generating a first portion of the frame corresponding to the ROI, the first portion having a first resolution; generating a second portion of the frame, the second portion having a second resolution that is lower than the first resolution); perform image signal processing on the first processed stream to generate a stream of processed image data ([0094] In some aspects, the post-processing engine 624 can process the salient portion of the frame and the peripheral portion of the frame to improve various aspects of the image data, such as color saturation, color balance, warping, and so forth. In some aspects, different parameters can be used for the salient and non-salient parts of the frame, resulting in different qualities for the different parts of the frame. For example, the front-end engine or the post-processing engine can perform sharpening on the salient portion of the frame to improve distinguishing edges); perform a second spatial processing on the stream of processed image data to generate a second processed stream of image data ([0096] In some aspects, the mask 616 can be provided to the post-processing engine 624 to improve image processing of the salient portion of the frame and the peripheral portion of the frame. After the salient portion of the frame and the peripheral portion of the frame are processed, the salient portion of the frame and the peripheral portion of the frame are provided to a blending engine 626 (e.g., a GPU) for blending the salient portion of the frame and the peripheral portion of the frame into a single output frame. [0103] The blending engine 722 may also be configured to perform various operations based on the mask. For example, a more sophisticated upscaling technique (e.g., bicubic) may be applied to the salient region, and a simpler upscaling technique (e.g., bilinear) may be applied to the peripheral region).

Regarding claim 15, Kundu teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor of a device, cause the device to: obtain a stream of image data ([0005] Disclosed are systems, apparatuses, methods, and computer-readable media for performing foveated sensing. According to at least one example, a method is provided for generating one or more frames. The method includes: capturing, using an image sensor, sensor data for a frame associated with a scene); obtain information identifying a location of one or more region of interest in the image data ([0005] determining a region of interest (ROI) associated with the scene); perform a first spatial processing on the stream of image data so as to change a spatial resolution of at least a portion of the streamed image data in dependence upon the location of the one or more region of interest to generate a first processed stream ([0005] generating a first portion of the frame corresponding to the ROI, the first portion having a first resolution; generating a second portion of the frame, the second portion having a second resolution that is lower than the first resolution); perform image signal processing on the first processed stream to generate a stream of processed image data ([0094] In some aspects, the post-processing engine 624 can process the salient portion of the frame and the peripheral portion of the frame to improve various aspects of the image data, such as color saturation, color balance, warping, and so forth. In some aspects, different parameters can be used for the salient and non-salient parts of the frame, resulting in different qualities for the different parts of the frame. For example, the front-end engine or the post-processing engine can perform sharpening on the salient portion of the frame to improve distinguishing edges); perform a second spatial processing on the stream of processed image data to generate a second processed stream of image data ([0096] In some aspects, the mask 616 can be provided to the post-processing engine 624 to improve image processing of the salient portion of the frame and the peripheral portion of the frame. After the salient portion of the frame and the peripheral portion of the frame are processed, the salient portion of the frame and the peripheral portion of the frame are provided to a blending engine 626 (e.g., a GPU) for blending the salient portion of the frame and the peripheral portion of the frame into a single output frame. [0103] The blending engine 722 may also be configured to perform various operations based on the mask. For example, a more sophisticated upscaling technique (e.g., bicubic) may be applied to the salient region, and a simpler upscaling technique (e.g., bilinear) may be applied to the peripheral region).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 3-4 and 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over Kundu in view of Duenyas (US20220070391A1).

Regarding claim 3, Kundu teaches the method of claim 1. Kundu further teaches wherein the second spatial image processing processes the region of interest and areas outside of the region of interest to generate a single frame ([0096] In some aspects, the mask 616 can be provided to the post-processing engine 624 to improve image processing of the salient portion of the frame and the peripheral portion of the frame. After the salient portion of the frame and the peripheral portion of the frame are processed, the salient portion of the frame and the peripheral portion of the frame are provided to a blending engine 626 (e.g., a GPU) for blending the salient portion of the frame and the peripheral portion of the frame into a single output frame. [0103] The blending engine 722 may also be configured to perform various operations based on the mask. For example, a more sophisticated upscaling technique (e.g., bicubic) may be applied to the salient region, and a simpler upscaling technique (e.g., bilinear) may be applied to the peripheral region). Kundu does not teach generate a single stream of processed image data having a single resolution. Duenyas, in the same field of endeavor of image spatial processing, teaches generate a single stream of processed image data having a single resolution ([0038] The processed data of the different data groups may then be organized in frames to be displayed (S312), which may be accomplished in several ways. In one example, a unified image may be created at the lowest resolution, unifying the ROIs and the remaining regions in their respective locations of the originally captured image. To create the unified image, the digital data group for each ROI can be digitally binned to generate pixel regions for display at the lowest resolution). Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Kundu with the teachings of Duenyas to generate a single stream of image data having a single resolution because "This processing chain may align the quality of the analog and digitally binned regions to represent a uniform image and eliminate transitions in image quality between different regions of the image" [Duenyas 0038].

Regarding claim 4, Kundu and Duenyas teach the method of claim 3. Kundu further teaches wherein the first spatial processing comprises pixel binning performed by an image sensor on a portion of the stream of image data not included in the one or more regions of interest, wherein the second spatial processing comprises upscaling the portion of the stream of processed image data that was subjected to pixel binning ([0035] In some aspects, an image sensor can be configured to capture a part of a frame in high resolution, which is referred to as a foveated region or a region of interest (ROI), and other parts of the frame at a lower resolution using various techniques (e.g., pixel binning), which is referred to as a peripheral region. [0054] In one illustrative example where a 48 megapixel (48 MP or 48 M) image is captured by the image sensor 130 using a 2×2 quad color filter array 200, a 2×2 binning process can be performed to generate a 12 MP binned image. The reduced-resolution image can be upsampled (upscaled) to a higher resolution in some cases (e.g., before or after being processed by the ISP 154)).

Regarding claim 7, Kundu teaches the method of claim 1. Duenyas teaches wherein the method obtains a plurality of regions of interest, wherein the method performs the first spatial processing to change a spatial resolution of at least a portion of the streamed image data in dependence upon the positions of the region of interests to generate a first processed stream and the method performs the second spatial processing on the stream of processed image data to generate at least one stream associated with the plurality of regions of interest ([0009] The at least one processor executing instructions to: (i) obtain at least one frame of image data of a scene read out from the pixel array; (ii) identify a plurality of regions of interest (ROIs) within the at least one frame; (iii) obtain subsequent frames of the scene, which comprises controlling the pixel array to perform first resolution imaging with respect to a first group of the ROIs, second resolution imaging with respect to a second group of the ROIs, and third resolution imaging with respect to a background region of the frames outside the plurality of ROIs, wherein the first and second resolutions are different from each other, the third resolution is lower than each of the first and second resolutions, and each group of the ROIs comprises one or more ROIs; (iv) provide image data obtained from the pixel array in pipelines each corresponding to image data imaged using one of the first, second or third resolutions; and (v) digitally process each of the pipelines separately to provide at least first, second and third resolution groups of image data to be displayed on the display). Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Kundu with the teachings of Duenyas to obtain a plurality of regions of interest to perform spatial processing because "Regions of interest (ROIs) of a captured scene may be assigned a lower analog binning factor (higher resolution) than remaining regions such as backgrounds constituting the majority of a scene, thereby reducing the size of processed data. This allows for: (i) reduced power consumption within the sensor; (ii) the performance of additional processing tasks on sensor as desired; and/or (iii) an increase in the frame rate without exceeding a power consumption budget" [Duenyas 0004].

Regarding claim 8, Kundu and Duenyas teach the method of claim 7. Duenyas teaches wherein the second spatial processing generates a separate stream associated with each region of interest and at least two of the generated streams have different spatial resolutions ([0009] wherein the first and second resolutions are different from each other, the third resolution is lower than each of the first and second resolutions, and each group of the ROIs comprises one or more ROIs; (iv) provide image data obtained from the pixel array in pipelines each corresponding to image data imaged using one of the first, second or third resolutions; and (v) digitally process each of the pipelines separately to provide at least first, second and third resolution groups of image data to be displayed on the display). Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Kundu with the teachings of Duenyas to generate a separate stream associated with each region of interest "to reduce the size of data to be processed in an image sensor…This allows for: (i) reduced power consumption within the sensor; (ii) the performance of additional processing tasks on sensor as desired; and/or (iii) an increase in the frame rate without exceeding a power consumption budget" [Duenyas 0004].

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Kundu in view of Gopalakrishna (US20130083245A1).

Regarding claim 6, Kundu teaches the method of claim 1. Gopalakrishna, in the same field of endeavor of image processing, teaches wherein the first processed stream is stored in a set of delay lines prior to performing image signal processing ([0070] In some embodiments the noise reduction apparatus comprises an input line delay 101. The input line delay 101 is configured to receive the original (ORG) input video signal on a line by line basis and can comprise any suitable means for delaying or storing a previous line or previous lines of input data. In some embodiments the input to the line delay 101 comprises either the input chroma or luma data. The input line delay 101 can in some embodiments be configured to output current, previous, and next line data to a spatial noise reduction apparatus 105, a spatial-temporal noise reducer blender 107, an edge adaptive threshold determiner 109, a motion detector 111, and a flesh tone detector 113).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Kundu with the teachings of Gopalakrishna to store the images as delay lines before the image signal processing because "The edge adaptor threshold determiner 109 can be configured to receive the line information from the original image (from the image line delay 101), and the previous frame or field image data (from the noise reduced line delay 103)…The edge adaptive threshold determiner 109 can be configured to generate an edge detection threshold value such that luma edge image portions can be detected" [Gopalakrishna 0082]. Claim 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Kundu in view of Riguer (US20210063741A1). Regarding claim 9, Kundu teaches the method of claim 1. Riguer, in the same field of endeavor of image spatial processing, teaches wherein the first spatial processing comprises: determining for each pixel of the stream of image data whether the pixel is horizontally aligned with the one or more regions of interest and: in a case that the pixel is not horizontally aligned with the one or more regions of interest performing a vertical scaling in relation to the pixel; and in a case that the pixel is horizontally aligned with the one or more regions of interest, not performing a vertical scaling in relation to the pixel ([0020] Scaling unit 220 receives rendered image 215 as well as foveal region information. In one implementation, scaling unit 220 converts the variable-sized regions in rendered image 215 into equi-sized regions in scaled image 225 by using different scale factors to scale the different variable-sized regions in rendered image 215. For example, in one implementation, scaling unit 210 maintains the original pixel density of the foveal region of rendered image 215 while scaling down the non-foveal regions of rendered image 215. 
[0021] If the original size of the given region is greater than the target size, then the given region will be downscaled (i.e., downsampled), which will cause each pixel value to be combined with one or more neighboring pixel values to produce a pixel value in the scaled version of the given region. If the original size of the given region is less than the target size, then the given region will be upscaled (i.e., expanded), which will cause each pixel value to be used in calculating the values of two or more pixels in the scaled version of the given region. [0019] The plurality of regions include a single foveal region and a plurality of non-foveal regions. In one implementation, the foveal region is a relatively smaller region than the non-foveal regions. In one implementation, the region scaling is matched to the acuity of the human visual system (HVS) and scaling within each region is driven by acuity. In other words, scaling increases as the distance from the foveal region increases. [0029] The scaling factors can be specified in different manners depending on the implementation. In one implementation, there is a scaling factor specified per region or per horizontal row and vertical column of regions. In another implementation, the scaling factor is specified using a formula that adjusts the amount of scaling based on the horizontal and vertical displacement from the foveal region). Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Kundu with the teachings of Riguer to perform vertical scaling when the pixel is not aligned with a region of interest because "a foveated rendered image has a variable amount of pixel resolution that varies according to a distance from the foveal region of the image, with the pixel resolution or fidelity reduced as a distance from the foveal region increases" [Riguer 0032]. Regarding claim 10, Kundu and Riguer teach the method of claim 9. 
Riguer teaches wherein the first spatial processing comprises: determining for each pixel of the stream of image data whether the pixel is vertically aligned with the one or more regions of interest and: in a case that the pixel is not vertically aligned with the one or more regions of interest performing a horizontal scaling in relation to the pixel; and in a case that the pixel is vertically aligned with the one or more regions of interest not performing a horizontal scaling in relation to the pixel ([0020] Scaling unit 220 receives rendered image 215 as well as foveal region information. In one implementation, scaling unit 220 converts the variable-sized regions in rendered image 215 into equi-sized regions in scaled image 225 by using different scale factors to scale the different variable-sized regions in rendered image 215. For example, in one implementation, scaling unit 210 maintains the original pixel density of the foveal region of rendered image 215 while scaling down the non-foveal regions of rendered image 215. [0021] If the original size of the given region is greater than the target size, then the given region will be downscaled (i.e., downsampled), which will cause each pixel value to be combined with one or more neighboring pixel values to produce a pixel value in the scaled version of the given region. If the original size of the given region is less than the target size, then the given region will be upscaled (i.e., expanded), which will cause each pixel value to be used in calculating the values of two or more pixels in the scaled version of the given region. [0019] The plurality of regions include a single foveal region and a plurality of non-foveal regions. In one implementation, the foveal region is a relatively smaller region than the non-foveal regions. In one implementation, the region scaling is matched to the acuity of the human visual system (HVS) and scaling within each region is driven by acuity. 
In other words, scaling increases as the distance from the foveal region increases. [0029] The scaling factors can be specified in different manners depending on the implementation. In one implementation, there is a scaling factor specified per region or per horizontal row and vertical column of regions. In another implementation, the scaling factor is specified using a formula that adjusts the amount of scaling based on the horizontal and vertical displacement from the foveal region). Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Kundu with the teachings of Riguer to perform horizontal scaling when the pixel is not aligned with a region of interest because "a foveated rendered image has a variable amount of pixel resolution that varies according to a distance from the foveal region of the image, with the pixel resolution or fidelity reduced as a distance from the foveal region increases" [Riguer 0032]. Claim 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Kundu in view of Riguer and Ribezzo and Rossi (Rossi is incorporated by reference in Ribezzo) (G. Ribezzo, L. De Cicco, V. Palmisano and S. Mascolo, "Bitrate Reduction for Omnidirectional Video Streaming: Comparing Variable Quantization Parameter and Variable Resolution Approaches," 2021 19th Mediterranean Communication and Computer Networking Conference (MedComNet), Ibiza, Spain, 2021, pp. 1-7; S. Rossi, C. Ozcinar, A. Smolic, and L. Toni, “Do users behave similarly in vr? investigation of the user influence on the system design,”ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 16, no. 2, pp. 1–26, 2020). Regarding claim 11, Kundu and Riguer teach the method of claim 9. 
Ribezzo and Rossi, in the same field of endeavor of stream processing, in combination teach wherein the first spatial processing comprises: determining for each pixel of the stream of image data whether the pixel is located within the one or more regions of interest ([Ribezzo pg. 2 para. 8] In the RoI selection phase (marked with 1) an algorithm detects a higher interest area spanning 120° horizontally. The algorithm used to select the most interesting areas can be a general content-aware algorithm based on saliency map, such as the one described in [17]). Rossi ([17], incorporated by reference) teaches ([Rossi pg. 4 para. 2] head movement determines field of view (FoV) as the pixel region of ODV to be seen by the HMD over time) and: in a case that the pixel is not located within the one or more regions of interest performing a horizontal scaling in relation to the pixel; and in a case that the pixel is located within the one or more regions of interest not performing a horizontal scaling in relation to the pixel ([Ribezzo pg. 3 para. 4] The Variable RESolution (VRES) approach is shown in the right branch of Figure 1. In this case, the encoding phase requires that the two regions outside the RoI are shrunk horizontally from a resolution res0 to a lower resolution res1. [pg. 5 para. 3] Figure 3a which corresponds to the case in which VQP employs a ∆qp = 5 to encode the regions outside the RoI and VRES downscales the horizontal resolution of the regions outside the RoI to 480p).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Kundu with the teachings of Ribezzo and Rossi to horizontally scale only the pixels outside of the ROI "to deliver to the user a video having a maximal quality in the regions currently falling into the users' viewport, keeping the other regions with a lower quality (or not delivered at all in the extreme case)" [Ribezzo pg. 1 para. 3].
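As an editorial illustration of the VRES-style processing cited above (full resolution inside the region of interest, horizontal shrinking outside it), the following hedged sketch operates on one line of pixels. The function name, the averaging-based shrink, and the half-open ROI interval are assumptions for illustration, not the encoding used in Ribezzo.

```python
# Hedged sketch of per-line ROI-aware horizontal scaling: pixels inside
# [roi_start, roi_end) keep full resolution; pixels outside are shrunk
# horizontally by averaging groups of `factor` neighbors. Illustrative
# only; Ribezzo's VRES encoder is not specified at this level of detail.

def scale_line(line, roi_start, roi_end, factor=2):
    """Return one output line with the ROI segment unchanged and the
    segments left and right of it horizontally downscaled."""
    def shrink(segment):
        return [
            sum(segment[i:i + factor]) / len(segment[i:i + factor])
            for i in range(0, len(segment), factor)
        ]
    left = shrink(line[:roi_start])
    roi = list(line[roi_start:roi_end])  # no scaling inside the ROI
    right = shrink(line[roi_end:])
    return left + roi + right
```

Note that lines whose ROI spans differ produce output lines of different lengths, which is the situation the claim 12 limitation (consecutive horizontal lines changing length or alignment) is directed to.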
Regarding claim 12, Kundu, Riguer, Ribezzo and Rossi teach the method of claim 11. Riguer teaches wherein in a case that consecutive horizontal lines change length and/or image data alignment due to a change in horizontal scaling following a change in alignment with the one or more regions of interest, the first spatial processing controls the first processed stream to contain two different versions of the same line or to contain different versions of a portion of the same line in order to provide image data for the image signal processing ([0079] In some aspects, a mask (e.g., a binary or bitmap mask or image) can be used to indicate the ROI or salient region of a scene. For instance, a first value (e.g., a value of 1) for pixels in the mask can specify pixels within the ROI and a second value (e.g., a value of 0) for pixels in the mask can specify pixels in the peripheral region (outside of the ROI). In one illustrative example, the mask can include a first color (e.g., a black color) indicating a peripheral region (e.g., a region to crop from a high-resolution image) and a second color (e.g., a white color) indicating the ROI. In some cases, ROI can be a rectangular region (e.g., a bounding box) identified by the mask. In some cases, the ROI can be a non-rectangular region. For instance, instead of specifying a bounding box, the start and end pixels of each line (e.g., each line of pixels) in the mask can be programmed independently to specify whether the pixel is part of the ROI or outside of the ROI. [0109] In some cases, the salient region/ROI of the frame and a second stream including the peripheral region of the frame may need to be temporarily stored in the memory 812 until the images are required by the post-processing engine 814. In this example, the peripheral region consumes less memory based on the lower resolution, which saves energy by requiring the memory 812 to write less content and decreases bandwidth consumption. 
The post-processing engine 814 can read the salient region stream and the peripheral region stream in the memory 812 and process one or more of the streams).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Kundu with the teachings of Riguer to have two different versions in order to provide image data for processing because "The post-processing engine 814 can read the salient region stream and the peripheral region stream in the memory 812 and process one or more of the streams. In some cases, the post-processing engine 814 can use the mask to control various additional processing functions, such as edge detection, color saturation, noise reduction, tone mapping, etc. In some aspects, the post-processing engine 814 is more computationally expensive and providing a mask 806 to perform calculations based on a particular region can significantly reduce the processing cost of various corrective measures" [Riguer 0109].

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Aartsen (US20230397896A1) teaches ROI image detection and spatial and signal processing based on the ROI.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jacqueline R. Zak, whose telephone number is (571) 272-4077. The examiner can normally be reached M-F 9-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Emily Terrell, can be reached at (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /JACQUELINE R ZAK/Examiner, Art Unit 2666 /EMILY C TERRELL/Supervisory Patent Examiner, Art Unit 2666

Prosecution Timeline

Nov 29, 2023
Application Filed
Jan 12, 2026
Non-Final Rejection — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586340
PIXEL PERSPECTIVE ESTIMATION AND REFINEMENT IN AN IMAGE
2y 5m to grant Granted Mar 24, 2026
Patent 12462343
MEDICAL DIAGNOSTIC APPARATUS AND METHOD FOR EVALUATION OF PATHOLOGICAL CONDITIONS USING 3D OPTICAL COHERENCE TOMOGRAPHY DATA AND IMAGES
2y 5m to grant Granted Nov 04, 2025
Patent 12373946
ASSAY READING METHOD
2y 5m to grant Granted Jul 29, 2025


Prosecution Projections

1-2
Expected OA Rounds
67%
Grant Probability
55%
With Interview (-11.4%)
2y 10m
Median Time to Grant
Low
PTA Risk
Based on 12 resolved cases by this examiner. Grant probability derived from career allow rate.
