Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/03/2025 has been entered.
Response to Amendment
Applicant’s Amendments filed on 12/03/2025 have been entered and made of record.
Currently pending Claim(s): 1–7, 9–19, 21–29 and 31–32
Independent Claim(s): 1, 13 and 25
Amended Claim(s): 1, 3, 5, 11, 13, 15, 17 and 25
Cancelled Claim(s): 8, 20 and 30
Response to Applicant’s Arguments
This office action is responsive to Applicant’s Arguments/Remarks Made in an Amendment received on 12/03/2025.
In view of Applicant’s Arguments/Remarks and the amendment filed on 12/03/2025 with respect to independent claims 1, 13 and 25 under 35 U.S.C. 103, the claim rejections have been fully considered, but the arguments are found not persuasive (see pages 10–13); therefore, the rejections under 35 U.S.C. 103 are maintained.
Applicant argues, in summary, that the applied prior art (Chu) does not disclose or suggest (see pages 10–13):
“adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image”
However, the Examiner respectfully disagrees and believes that Applicant mischaracterizes the reference. The Examiner has thoroughly reviewed Applicant’s arguments but respectfully submits that the cited references reasonably and properly meet the claimed invention.
Chu discloses focusing on a first portion and defocusing or blurring a second portion. The Examiner interprets the first and second portions as the claimed regions of interest. See Chu, ¶ [0064], “The difference in sharpness of edges of objects in the first region 301 and the blurred edges of the defocused objects in the second region 303 enables the background differentiation methods disclosed herein”.

Applicant argues that “Chu is changing the focus to increase the ROI in the foreground to improve background removal, …, The background is not a region of interest and is being removed in Chu”. However, Chu specifically focuses on the objects in the first region 301 and defocuses the objects in the second region 303, and the Examiner interprets these regions as the claimed regions of interest. Chu thus teaches the specific method of focusing on one ROI and defocusing a second ROI to capture an additional image. Chu specifically acquires video of the first and second portions after the focusing and defocusing. See Chu, ¶ [0010], “(c) acquiring video data of the physical environment including the first portion and the second portion”, which the Examiner relies on to teach “obtaining the additional image based on adjusting the focus of the lens”.

Chu discloses that this method can be used in an embodiment of the invention for background removal. The Examiner notes that this embodiment is an optional step, and the Examiner relies on Chu for acquiring an image before any background removal. See Chu, ¶ [0065], “Using the aperture of the camera device 200 to differentiate the background portion from the foreground portion beneficially reduces the computing power that would otherwise be required for background removal and/or replacement methods, such as the method set forth in FIG. 4”. The Examiner also notes that Chu teaches the new limitation “determining a sharpness of the second ROI at a distance from the second ROI to the focal point”.
Specifically, Chu looks at the difference in sharpness of the edges of objects in the first region and the second region. See Chu, ¶ [0064], “The difference in sharpness of edges of objects in the first region 301 and the blurred edges of the defocused objects in the second region 303 enables the background differentiation methods disclosed herein”. This implies determining the sharpness of both the first region and the second region, as the sharpness of each must be known in order to determine the difference.
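As an illustrative aid only (not part of the record), the sharpness-difference determination discussed above can be sketched with a conventional Tenengrad-style Sobel measure; the function names and ROI layout below are hypothetical:

```python
import numpy as np

def tenengrad_sharpness(region: np.ndarray) -> float:
    """Mean squared Sobel gradient magnitude (a Tenengrad-style focus measure)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = region.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Explicit loops keep the sketch dependency-free; a real system would use a
    # vectorized convolution.
    for i in range(h - 2):
        for j in range(w - 2):
            patch = region[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return float(np.mean(gx ** 2 + gy ** 2))

def sharpness_difference(image: np.ndarray, roi1, roi2):
    """roi = (row, col, height, width); returns (s1, s2, s1 - s2)."""
    r1 = image[roi1[0]:roi1[0] + roi1[2], roi1[1]:roi1[1] + roi1[3]]
    r2 = image[roi2[0]:roi2[0] + roi2[2], roi2[1]:roi2[1] + roi2[3]]
    s1 = tenengrad_sharpness(r1)
    s2 = tenengrad_sharpness(r2)
    return s1, s2, s1 - s2
```

A focused (high-gradient) first region yields a larger score than a defocused (flat) second region, so the difference alone presupposes that each region's sharpness has been determined, as argued above.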
Therefore, under this broad interpretation, Chu in combination with Xiang, Xie and Ma teaches, discloses or suggests Applicant’s invention of determining two regions of interest associated with objects and determining the sharpness of those regions; adjusting the focus of the lens so that the first region is sharp and the second region is unsharp; obtaining an additional image; detecting keypoints associated with the first ROI; transforming a portion of the image to align with the keypoints; and performing upscaling on the transformed portion of the image. Thus, due to Applicant’s broad claim language, Applicant’s invention is not far removed from the art of record. Accordingly, these limitations do not render the claims patentably distinct over the prior art of record. As a result, it is respectfully submitted that the present application is not in condition for allowance.
Thus, the Examiner maintains that the limitations as presented and as rejected were properly and adequately met. The rejection as presented in the non-final rejection is maintained with regard to the above limitation. Additional and/or modified citations may be present to more concisely address the limitations; however, the grounds of rejection remain the same.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claim(s) 1–2, 4, 9–11, 13–14, 16, 21–23, 25–26, 28 and 31–32 are rejected under 35 U.S.C. 103 as being unpatentable over Xiang et al. (US 2023/0377095 A1, hereafter, “Xiang”) in view of Xie et al. (CN 112669207 A, hereafter, “Xie”) further in view of Ma et al. (See NPL attached, “Deep Face Super-Resolution with Iterative Collaboration between Attentive Recovery and Landmark Estimation”, hereafter, “Ma”) and further in view of Chu et al. (US 2022/0256116 A1, hereafter, “Chu”).
Regarding claim 1, Xiang teaches a method of processing one or more images (See Xiang, ¶ [0015], the method 100 may be performed to produce an enhanced image or images), comprising:
determining a first region of interest (ROI) in an image, wherein the first ROI is associated with a first object (See Xiang, ¶ [0017], Some examples of object detection may correlate patches of the image with an object template (e.g., face, text, vehicle, other object image) or templates to determine a matching object region or regions (e.g., region(s) of interest (ROI) and/or bounding box(es)) where the object(s) are located in the image);
[determining a sharpness of the first ROI at a distance from the first ROI to a focal point;]
detecting a second ROI in the image (See Xiang, ¶ [0010], In some examples of the techniques described herein, a region or regions that include an object or objects (e.g., faces, vehicles, and/or text, etc.) may be detected. Note: The examiner is interpreting the multiple regions as having a second ROI);
[determining a sharpness of the second ROI at a distance from the second ROI to the focal point;
adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image;
obtaining the additional image based on adjusting the focus of the lens;
detecting keypoints associated with the first ROI of the additional image, wherein the first ROI comprises a face of a person;
transforming a portion of image data to align with the detected keypoints associated with the first ROI;
determining to perform an upsampling process on the transformed portion of the image data in the first ROI based on the sharpness of the first ROI and upsampling the transformed portion of the image data].
However, Xiang fail(s) to teach determining a sharpness of the first ROI at a distance from the first ROI to a focal point; determining a sharpness of the second ROI at a distance from the second ROI to the focal point; adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image; obtaining the additional image based on adjusting the focus of the lens; detecting keypoints associated with the first ROI of the additional image, wherein the first ROI comprises a face of a person; transforming a portion of image data to align with the detected keypoints associated with the first ROI; determining to perform an upsampling process on the transformed portion of the image data in the first ROI based on the sharpness of the first ROI and upsampling the transformed portion of the image data.
Xie, working in the same field of endeavor, teaches: determining a sharpness of the first ROI at a distance from the first ROI to a focal point (See Xie, ¶ [0019], Step 1, judge whether to enter the face resolution enhancement mode; in a picture with a human face present, when the illumination intensity of the ambient light (lx < 50) or the distance (l < 280 cm) from the face to the camera satisfies the condition, open the face resolution enhancement mode. ¶ [0022], Step 4, human face image quality evaluation; the Tenengrad gradient method is used to evaluate the detected human face; the Tenengrad gradient method uses the Sobel operator to calculate the gradient in the horizontal and vertical directions respectively; the higher the gradient value, the clearer the image; a human face image with a gradient value g less than 10.0 is defined as an image to be processed that needs resolution enhancement processing. Note: the Examiner is interpreting g as the sharpness, noting that the higher the gradient, the clearer the image);
based on the sharpness of the first ROI; and upsampling the transformed portion of the image data (See Xie, ¶ [0023], Step 5, super-resolution enhancement processing; the blurry image from Step 4 is used as the input of the super-resolution network, and through network inference, the difference image is combined with the low-resolution image to generate a high-resolution image satisfying the requirement. Note: the super-resolution is being interpreted as upsampling, and Xie determines a distance threshold and a sharpness threshold before performing image super-resolution).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference to include determining a sharpness of the first ROI at a distance from the first ROI to a focal point, and upsampling the transformed portion of the image data based on the sharpness of the first ROI, based on the method of Xie’s reference. The suggestion/motivation would have been to efficiently enhance blurry and distant faces for facial recognition (See Xie, ¶ [0002–0003 and 0016]).
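For illustration only (not part of the record), the threshold scheme quoted from Xie above can be sketched as a simple gating function. The function name is hypothetical, and the thresholds and comparison directions (lux < 50, l < 280 cm, g < 10.0) are taken as quoted from the machine translation:

```python
def needs_face_enhancement(gradient_g: float, ambient_lux: float, distance_cm: float) -> bool:
    """Illustrative gating per the quoted Xie translation: the face resolution
    enhancement mode is opened under low illumination or at the quoted distance
    condition; a face whose Tenengrad score g is below 10.0 is then marked for
    super-resolution processing."""
    mode_on = (ambient_lux < 50.0) or (distance_cm < 280.0)
    return mode_on and (gradient_g < 10.0)
```

Under this sketch, a blurry face (g below the threshold) captured in low light is routed to super-resolution, while a sharp face is left untouched, matching the Examiner's reading that the sharpness and distance thresholds gate the upsampling.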
However, Xiang and Xie fail(s) to teach determining a sharpness of the second ROI at a distance from the second ROI to the focal point; adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image; obtaining the additional image based on adjusting the focus of the lens; detecting keypoints associated with the first ROI of the additional image, wherein the first ROI comprises a face of a person; transforming a portion of image data to align with the detected keypoints associated with the first ROI; determining to perform an upsampling process on the transformed portion of the image data in the first ROI.
Ma, working in the same field of endeavor, teaches: detecting keypoints associated with the first ROI, wherein the first ROI comprises a face of a person (See Ma, [3.1. Deep Iterative Collaboration], Similarly, the face alignment branch utilizes the recurrent features from the previous step f_AR^(n-1) and the SR features extracted by A_1 from the SR images I_SR^n as the guidance for estimating landmarks more accurately);
transforming a portion of image data to align with the detected keypoints associated with the first ROI (See Ma, [3.1. Deep Iterative Collaboration], Similarly, the face alignment branch utilizes the recurrent features from the previous step f_AR^(n-1) and the SR features extracted by A_1 from the SR images I_SR^n as the guidance for estimating landmarks more accurately. See also [Figure 2]. Note: the Examiner is interpreting the facial alignment module as aligning the keypoints. Ma processes a low-resolution image using super-resolution, then iteratively uses that SR image to align the facial landmarks of the low-resolution image for a better super-resolution in the next step);
determining to perform an upsampling process on the transformed portion of the image data in the first ROI (See Ma, [3.1. Deep Iterative Collaboration], Given an LR (e.g., low resolution) face image I_LR, facial landmarks are important for the recovery procedure, …, Similar to the SR branch, the recurrent alignment branch includes a pre-processing block A_1, a recursive hourglass block A_R and a post-processing block A_2. For the nth step where n = 1, ..., N, the SR branch recovers SR images I_SR^n by using the alignment results and the feedback information from the previous step n-1, denoted as L^(n-1) and f_GR^(n-1), respectively. Besides, LR inputs are also important in each step. Hence LR features extracted by G1 are also fed into the recursive block. Therefore, the face SR process can be formulated by: [Equation image: media_image1.png] where U denotes an upsampling operation. Note: the Examiner is interpreting the previous super-resolution landmarks and features as the transformed portion. Ma processes a low-resolution image using super-resolution, then iteratively uses that SR image to align the facial landmarks of the low-resolution image for a better super-resolution in the next step).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference to include detecting keypoints associated with the first ROI, wherein the first ROI comprises a face of a person; transforming a portion of image data to align with the detected keypoints associated with the first ROI; and determining to perform an upsampling process on the transformed portion of the image data in the first ROI, based on the method of Ma’s reference. The suggestion/motivation would have been to more accurately align the face using fewer parameters (See Ma, [Study of Iterative Learning]).
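As an illustrative aid only (not part of the record), Ma's iterative collaboration between the super-resolution branch and the alignment branch can be sketched as an alternating loop; the callables and their signatures below are hypothetical stand-ins for Ma's network blocks:

```python
def deep_iterative_collaboration(lr_image, sr_step, align_step, n_steps=3):
    """Alternate between SR recovery and landmark estimation.

    Hypothetical interfaces mirroring Ma's recurrent formulation:
      sr_step(lr_image, landmarks, feedback) -> (sr_image, feedback)
      align_step(sr_image, feedback) -> landmarks
    Each pass reuses the previous step's landmarks and feedback features, so the
    SR result and the landmark estimate refine each other over the N steps.
    """
    landmarks, feedback = None, None
    sr_image = lr_image
    for _ in range(n_steps):
        sr_image, feedback = sr_step(lr_image, landmarks, feedback)
        landmarks = align_step(sr_image, feedback)
    return sr_image, landmarks
```

The loop structure is what the Examiner relies on: each iteration both upsamples (the SR branch) and re-aligns the face landmarks, using the transformed output of the prior step as input.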
However, Xiang, Xie and Ma fail(s) to teach determining a sharpness of the second ROI at a distance from the second ROI to the focal point; adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image; obtaining the additional image based on adjusting the focus of the lens.
Chu, working in the same field of endeavor, teaches: determining a sharpness of the second ROI at a distance from the second ROI to the focal point (See Chu, ¶ [0064], The difference in sharpness of edges of objects in the first region 301 and the blurred edges of the defocused objects in the second region 303. Note: to determine the difference in sharpness, the sharpness of the two portions has to be known);
adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image (See Chu, ¶ [0064], The subject matter, e.g., the conference participant 302, located at or proximate to the plane of focus 304 in the first portion 301, remains in focus while the aperture 316a,b is adjusted to defocus or blur subject matter in the second portion. Note: the Examiner is interpreting adjusting the focus as adjusting the aperture 316a,b);
obtaining the additional image based on adjusting the focus of the lens (See Chu, ¶ [0010], (c) acquiring video data of the physical environment including the first portion and the second portion).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference to include determining a sharpness of the second ROI at a distance from the second ROI to the focal point; adjusting a focus of a lens of an image sensor to increase the sharpness of the first ROI and decrease a sharpness of the second ROI for an additional image; and obtaining the additional image based on adjusting the focus of the lens, based on the method of Chu’s reference. The suggestion/motivation would have been to include a subject in the depth of field, focus more on that subject and less on another subject, and decrease computational consumption (See Chu, ¶ [0065 and 0006]).
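For illustration only (not part of the record), the depth-of-field effect Chu is cited for can be sketched with the standard thin-lens circle-of-confusion approximation; all parameter values below are assumptions chosen for the example:

```python
def blur_spot_mm(focal_len_mm: float, f_number: float,
                 focus_dist_mm: float, subject_dist_mm: float) -> float:
    """Blur-spot (circle-of-confusion) diameter in mm for a subject off the
    focused plane, using the standard thin-lens approximation:
        c = (f^2 / N) * |s - s_f| / (s * (s_f - f))
    where f is the focal length, N the f-number, s_f the focused distance,
    and s the subject distance."""
    f, n, sf, s = focal_len_mm, f_number, focus_dist_mm, subject_dist_mm
    return (f * f / n) * abs(s - sf) / (s * (sf - f))
```

For example, with an assumed 50 mm f/2 lens focused at a first ROI 2 m away, the blur spot at the focused plane is zero, while a second ROI at 5 m receives a nonzero blur spot, i.e., the first region remains sharp while the second is defocused, consistent with Chu's aperture-based differentiation.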
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Xie, Ma and Chu with Xiang to obtain the invention as specified in claim 1.
Regarding claim 2, Xiang in view of Xie further in view of Ma and further in view of Chu teaches the method of claim 1, [wherein determining to perform the upsampling process on the image data in the first ROI based on the sharpness being greater than a sharpness threshold].
However, Xiang, Ma and Chu fail(s) to teach wherein determining to perform the upsampling process on the image data in the first ROI based on the sharpness being greater than a sharpness threshold.
Xie, working in the same field of endeavor, teaches: wherein determining to perform the upsampling process on the image data in the first ROI based on the sharpness being greater than a sharpness threshold (See Xie, ¶ [0019], Step 1, judge whether to enter the face resolution enhancement mode; in a picture with a human face present, when the illumination intensity of the ambient light (lx < 50) or the distance (l < 280 cm) from the face to the camera satisfies the condition, open the face resolution enhancement mode. ¶ [0022], Step 4, human face image quality evaluation; the Tenengrad gradient method is used to evaluate the detected human face; the Tenengrad gradient method uses the Sobel operator to calculate the gradient in the horizontal and vertical directions respectively; the higher the gradient value, the clearer the image; a human face image with a gradient value g less than 10.0 is defined as an image to be processed that needs resolution enhancement processing. Note: the Examiner is interpreting g as the sharpness, noting that the higher the gradient, the clearer the image).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference such that the upsampling process is performed on the image data in the first ROI based on the sharpness being greater than a sharpness threshold, based on the method of Xie’s reference. The suggestion/motivation would have been to efficiently enhance blurry and distant faces for facial recognition (See Xie, ¶ [0002–0003 and 0016]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Xie with Xiang, Ma and Chu to obtain the invention as specified in claim 2.
Regarding claim 4, Xiang in view of Xie further in view of Ma and further in view of Chu teaches the method of claim 1, further comprising:
[determining the distance of the first ROI from the focal point is greater than a threshold distance; and
determining to perform the upsampling process on the image data in the first ROI based on the distance of the first ROI from the focal point being greater than the threshold distance].
However, Xiang, Ma and Chu fail(s) to teach determining the distance of the first ROI from the focal point is greater than a threshold distance; and determining to perform the upsampling process on the image data in the first ROI based on the distance of the first ROI from the focal point being greater than the threshold distance.
Xie, working in the same field of endeavor, teaches: determining the distance of the first ROI from the focal point is greater than a threshold distance (See Xie, ¶ [0019], Step 1, judge whether to enter the face resolution enhancement mode; in a picture with a human face present, when the illumination intensity of the ambient light (lx < 50) or the distance (l < 280 cm) from the face to the camera satisfies the condition, open the face resolution enhancement mode); and
determining to perform the upsampling process on the image data in the first ROI based on the distance of the first ROI from the focal point being greater than the threshold distance (See Xie, ¶ [0023], Step 5, super-resolution enhancement processing; the blurry image from Step 4 is used as the input of the super-resolution network, and through network inference, the difference image is combined with the low-resolution image to generate a high-resolution image satisfying the requirement. Note: the super-resolution is being interpreted as upsampling, and Xie determines a distance threshold and a sharpness threshold before performing image super-resolution).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference to include determining the distance of the first ROI from the focal point is greater than a threshold distance, and determining to perform the upsampling process on the image data in the first ROI based on the distance of the first ROI from the focal point being greater than the threshold distance, based on the method of Xie’s reference. The suggestion/motivation would have been to efficiently enhance blurry and distant faces for facial recognition (See Xie, ¶ [0002–0003 and 0016]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Xie with Xiang, Ma and Chu to obtain the invention as specified in claim 4.
Regarding claim 9, Xiang in view of Xie further in view of Ma and further in view of Chu teaches the method of claim 1, further comprising: [based on transforming the portion of the image, obtaining an output image from a machine learning (ML) model trained to increase a resolution and enhance the face at least in part by inputting the first ROI].
However, Xiang, Xie and Chu fail(s) to teach based on transforming the portion of the image, obtaining an output image from a machine learning (ML) model trained to increase a resolution and enhance the face at least in part by inputting the first ROI.
Ma, working in the same field of endeavor, teaches: based on transforming the portion of the image, obtaining an output image from a machine learning (ML) model trained to increase a resolution and enhance the face at least in part by inputting the first ROI (See Ma, [3.1. Deep Iterative Collaboration], Given an LR (e.g., low resolution) face image I_LR, facial landmarks are important for the recovery procedure, …, Similar to the SR branch, the recurrent alignment branch includes a pre-processing block A_1, a recursive hourglass block A_R and a post-processing block A_2. For the nth step where n = 1, ..., N, the SR branch recovers SR images I_SR^n by using the alignment results and the feedback information from the previous step n-1, denoted as L^(n-1) and f_GR^(n-1), respectively. Besides, LR inputs are also important in each step. Hence LR features extracted by G1 are also fed into the recursive block. Therefore, the face SR process can be formulated by: [Equation image: media_image1.png] where U denotes an upsampling operation. Note: the Examiner is interpreting the previous super-resolution landmarks and features as the transformed portion. Ma processes a low-resolution image using super-resolution, then iteratively uses that SR image to align the facial landmarks of the low-resolution image for a better super-resolution in the next step).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference such that, based on transforming the portion of the image, an output image is obtained from a machine learning (ML) model trained to increase a resolution and enhance the face at least in part by inputting the first ROI, based on the method of Ma’s reference. The suggestion/motivation would have been to more accurately align the face using fewer parameters (See Ma, [Study of Iterative Learning]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Ma with Xiang, Xie and Chu to obtain the invention as specified in claim 9.
Regarding claim 10, Xiang teaches the method of claim 9, further comprising: superimposing the output image on an upsampled version of the image (See Xiang, ¶ [0013], Some examples of the techniques described herein may utilize a machine learning model or models (e.g., deep learning) to increase image resolution and/or quality. For instance, some techniques may be utilized to generate super-resolution images with increased object (e.g., face) rendering quality. ¶ [0042], In some examples, the apparatus may blend the enhanced object region(s) with the enhanced background region(s)).
Regarding claim 11, Xiang in view of Xie further in view of Ma and further in view of Chu teaches the method of claim 1, further comprising:
[determining a sharpness differential between the first ROI and the second ROI].
However, Xiang, Xie and Ma fail(s) to teach determining a sharpness differential between the first ROI and the second ROI.
Chu, working in the same field of endeavor, teaches: determining a sharpness differential between the first ROI and the second ROI (See Chu, ¶ [0064], The difference in sharpness of edges of objects in the first region 301 and the blurred edges of the defocused objects in the second region 303 enables the background differentiation methods disclosed herein).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang’s reference to include determining a sharpness differential between the first ROI and the second ROI based on the method of Chu’s reference. The suggestion/motivation would have been to include a subject in the depth of field, focus more on that subject and less on another subject, and decrease computational consumption (See Chu, ¶ [0065 and 0006]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Chu with Xiang, Xie and Ma to obtain the invention as specified in claim 11.
Regarding claim 13, claim 13 is rejected the same as claim 1 and the arguments similar to that presented above for claim 1 are equally applicable to the claim 13, and all of the other limitations similar to claim 1 are not repeated herein, but incorporated by reference. Furthermore, Xiang teaches an apparatus for processing one or more images, comprising: at least one memory; and at least one processor coupled with the at least one memory and configured to (See Xiang, ¶ [0015], the method 100 may be performed to produce an enhanced image or images. [FIG. 3], 328 Processor, 326 Memory, 324 Apparatus).
Regarding claim 14, claim 14 is rejected the same as claim 2 and the arguments similar to that presented above for claim 2 are equally applicable to the claim 14, and all of the other limitations similar to claim 2 are not repeated herein, but incorporated by reference.
Regarding claim 16, claim 16 is rejected the same as claim 4 and the arguments similar to that presented above for claim 4 are equally applicable to the claim 16, and all of the other limitations similar to claim 4 are not repeated herein, but incorporated by reference.
Regarding claim 21, claim 21 is rejected the same as claim 9 and the arguments similar to that presented above for claim 9 are equally applicable to the claim 21, and all of the other limitations similar to claim 9 are not repeated herein, but incorporated by reference.
Regarding claim 22, claim 22 is rejected the same as claim 10 and the arguments similar to that presented above for claim 10 are equally applicable to the claim 22, and all of the other limitations similar to claim 10 are not repeated herein, but incorporated by reference.
Regarding claim 23, claim 23 is rejected the same as claim 11 and the arguments similar to that presented above for claim 11 are equally applicable to the claim 23, and all of the other limitations similar to claim 11 are not repeated herein, but incorporated by reference.
Regarding claim 25, claim 25 is rejected the same as claim 1 and the arguments similar to that presented above for claim 1 are equally applicable to the claim 25, and all of the other limitations similar to claim 1 are not repeated herein, but incorporated by reference. Furthermore, Xiang teaches a non-transitory computer-readable medium having stored thereon instructions that, when executed by at least one processor, cause the at least one processor to (See Xiang, ¶ [0054], In some examples, the memory 326 may be a non-transitory tangible machine-readable storage medium. [FIG. 3], 328 Processor, 326 Memory, 324 Apparatus).
Regarding claim 26, claim 26 is rejected on the same grounds as claim 2. The arguments presented above for claim 2 apply equally to claim 26, and the limitations of claim 26 that parallel those of claim 2 are not repeated herein but are incorporated by reference.
Regarding claim 28, claim 28 is rejected on the same grounds as claim 4. The arguments presented above for claim 4 apply equally to claim 28, and the limitations of claim 28 that parallel those of claim 4 are not repeated herein but are incorporated by reference.
Regarding claim 31, claim 31 is rejected on the same grounds as claim 9. The arguments presented above for claim 9 apply equally to claim 31, and the limitations of claim 31 that parallel those of claim 9 are not repeated herein but are incorporated by reference.
Regarding claim 32, claim 32 is rejected on the same grounds as claim 10. The arguments presented above for claim 10 apply equally to claim 32, and the limitations of claim 32 that parallel those of claim 10 are not repeated herein but are incorporated by reference.
Claim(s) 3, 15, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Xiang et al. (US 2023/0377095 A1, hereafter, “Xiang”) in view of Xie et al. (CN 112669207 A, hereafter, “Xie”) further in view of Ma et al. (See NPL attached, “Deep Face Super-Resolution with Iterative Collaboration between Attentive Recovery and Landmark Estimation”, hereafter, “Ma”) further in view of Chu et al. (US 2022/0256116 A1, hereafter, “Chu”) and further in view of Zhang et al. (US 2020/0320352 A1, hereafter, “Zhang”).
Regarding claim 3, Xiang in view of Xie, further in view of Ma, and further in view of Chu teaches the method of claim 1, further comprising:
[determining a size of the second ROI is less than a size threshold; and
determining to perform an upsampling process on image data in the second ROI based on the size of the second ROI being less than the size threshold].
However, Xiang, Xie, Ma, and Chu fail to teach determining a size of the second ROI is less than a size threshold; and determining to perform an upsampling process on image data in the second ROI based on the size of the second ROI being less than the size threshold.
Zhang, working in the same field of endeavor, teaches: determining a size of the second ROI is less than a size threshold (See Zhang, ¶ [0107], wherein the potential recognition region includes a region with a designated content and a size no greater than a preset threshold); and
determining to perform an upsampling process on image data in the second ROI based on the size of the second ROI being less than the size threshold (See Zhang, ¶ [0107], wherein the potential recognition region includes a region with a designated content and a size no greater than a preset threshold; determine an up-sampled potential recognition region by up-sampling the potential recognition region).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang to include determining a size of the second ROI is less than a size threshold, and determining to perform an upsampling process on image data in the second ROI based on the size of the second ROI being less than the size threshold, as taught by Zhang. The suggestion/motivation would have been to improve the success rate of recognition of small objects (See Zhang, ¶ [0065]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Zhang with Xiang, Xie, Ma and Chu to obtain the invention as specified in claim 3.
Regarding claim 15, claim 15 is rejected on the same grounds as claim 3. The arguments presented above for claim 3 apply equally to claim 15, and the limitations of claim 15 that parallel those of claim 3 are not repeated herein but are incorporated by reference.
Regarding claim 27, claim 27 is rejected on the same grounds as claim 3. The arguments presented above for claim 3 apply equally to claim 27, and the limitations of claim 27 that parallel those of claim 3 are not repeated herein but are incorporated by reference.
Claim(s) 5–7, 16–19, 28 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Xiang et al. (US 2023/0377095 A1, hereafter, “Xiang”) in view of Xie et al. (CN 112669207 A, hereafter, “Xie”) further in view of Ma et al. (See NPL attached, “Deep Face Super-Resolution with Iterative Collaboration between Attentive Recovery and Landmark Estimation”, hereafter, “Ma”) further in view of Chu et al. (US 2022/0256116 A1, hereafter, “Chu”) and further in view of Stephan et al. (US 2009/0060273, hereafter, “Stephan”).
Regarding claim 5, Xiang in view of Xie, further in view of Ma, and further in view of Chu teaches the method of claim 1, further comprising:
determining one or more image characteristics of the second ROI (See Xiang, ¶ [0025], An identity feature is a value or values (e.g., vector(s)) that relate to an object identity (e.g., object type, object instance, and/or identifying facial characteristics of a person, etc.)); and
[determining not to perform the upsampling process on image data in the second ROI based on the one or more image characteristics of the second ROI].
However, Xiang, Xie, Ma, and Chu fail to teach determining not to perform the upsampling process on image data in the second ROI based on the one or more image characteristics of the second ROI.
Stephan, working in the same field of endeavor, teaches: determining not to perform the upsampling process on image data in the second ROI based on the one or more image characteristics of the second ROI (See Stephan, ¶ [0070], "At step 912, the distance d retrieved from the objectlist is compared to the reference distance d_def. If d is less than or equal to d_def, at step 914, the portion of the image data is upsampled by an upsampling factor sf_up that may be determined, e.g., as explained with reference to Equation (1) above. If d is larger than d_def, at step 916, the portion of the image data is downsampled by a downsampling factor sf_down that may be determined, e.g., as explained with reference to Equation (2) above").
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang to include determining not to perform the upsampling process on image data in the second ROI based on the one or more image characteristics of the second ROI, as taught by Stephan. The suggestion/motivation would have been to provide results that are less prone to errors caused by variation in the distance of objects (See Stephan, ¶ [0007]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Stephan with Xiang, Xie, Ma and Chu to obtain the invention as specified in claim 5.
Regarding claim 6, Xiang teaches the method of claim 5, wherein the first ROI and the second ROI are associated with a common object type (See Xiang, ¶ [0017], one examples of object detection may correlate patches of the image with an object template (e.g., face, text, vehicle, other object image) or templates to determine a matching object region or regions (e.g., region(s) of interest (ROI) and/or bounding box(es)) where the object(s) are located in the image).
Regarding claim 7, Xiang teaches the method of claim 6, wherein the common object type comprises a face region of a person (See Xiang, ¶ [0017], one examples of object detection may correlate patches of the image with an object template (e.g., face, text, vehicle, other object image) or templates to determine a matching object region or regions (e.g., region(s) of interest (ROI) and/or bounding box(es)) where the object(s) are located in the image).
Regarding claim 17, claim 17 is rejected on the same grounds as claim 5. The arguments presented above for claim 5 apply equally to claim 17, and the limitations of claim 17 that parallel those of claim 5 are not repeated herein but are incorporated by reference.
Regarding claim 18, claim 18 is rejected on the same grounds as claim 6. The arguments presented above for claim 6 apply equally to claim 18, and the limitations of claim 18 that parallel those of claim 6 are not repeated herein but are incorporated by reference.
Regarding claim 19, claim 19 is rejected on the same grounds as claim 7. The arguments presented above for claim 7 apply equally to claim 19, and the limitations of claim 19 that parallel those of claim 7 are not repeated herein but are incorporated by reference.
Regarding claim 29, claim 29 is rejected on the same grounds as claim 5. The arguments presented above for claim 5 apply equally to claim 29, and the limitations of claim 29 that parallel those of claim 5 are not repeated herein but are incorporated by reference.
Claim(s) 12 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Xiang et al. (US 2023/0377095 A1, hereafter, “Xiang”) in view of Xie et al. (CN 112669207 A, hereafter, “Xie”) further in view of Ma et al. (See NPL attached, “Deep Face Super-Resolution with Iterative Collaboration between Attentive Recovery and Landmark Estimation”, hereafter, “Ma”) further in view of Chu et al. (US 2022/0256116 A1, hereafter, “Chu”) and further in view of Corcoran et al. (US 2009/0303342 A1, hereafter, “Corcoran”).
Regarding claim 12, Xiang in view of Xie, further in view of Ma, and further in view of Chu teaches the method of claim 1, further comprising:
[resizing a bounding box associated with the first ROI;
determining that the resized bounding box crops skin information based on a border region of the resized bounding box; and
modifying the resized bounding box to include a region outside of the resized bounding box that corresponds to the skin information].
However, Xiang, Xie, Ma, and Chu fail to teach resizing a bounding box associated with the first ROI; determining that the resized bounding box crops skin information based on a border region of the resized bounding box; and modifying the resized bounding box to include a region outside of the resized bounding box that corresponds to the skin information.
Corcoran, working in the same field of endeavor, teaches: resizing a bounding box associated with the first ROI (See Corcoran, ¶ [0138], the tracking algorithm after inspecting the history record for this face region may first employ the next largest size of face detector, then the current size);
determining that the resized bounding box crops skin information based on a border region of the resized bounding box (See Corcoran, ¶ [0138], If the face is still not confirmed then additional filters such as a skin pixel filter will try to determine if the face has turned to an angle, or has perhaps grown more than one size, or moved more than was expected and is thus outside the original bounding box which can then be enlarged); and
modifying the resized bounding box to include a region outside of the resized bounding box that corresponds to the skin information (See Corcoran, ¶ [0138], If the face is still not confirmed then additional filters such as a skin pixel filter will try to determine if the face has turned to an angle, or has perhaps grown more than one size, or moved more than was expected and is thus outside the original bounding box which can then be enlarged).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiang to include resizing a bounding box associated with the first ROI; determining that the resized bounding box crops skin information based on a border region of the resized bounding box; and modifying the resized bounding box to include a region outside of the resized bounding box that corresponds to the skin information, as taught by Corcoran. The suggestion/motivation would have been to track face regions that are known to a high degree of accuracy at the beginning of each preview image frame and to account for the movement of tracked faces (See Corcoran, ¶¶ [0126] and [0139]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Corcoran with Xiang, Xie, Ma and Chu to obtain the invention as specified in claim 12.
Regarding claim 24, claim 24 is rejected on the same grounds as claim 12. The arguments presented above for claim 12 apply equally to claim 24, and the limitations of claim 24 that parallel those of claim 12 are not repeated herein but are incorporated by reference.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Agrawal et al. (US 9965865 B1) teaches devices and techniques for segmentation of image data using depth data. In various examples, color image data may be received from a digital camera. In some examples, depth image data may be received from a depth sensor. In various examples, the depth image data may be separated into a plurality of clusters of depth image data, wherein each cluster is associated with a respective range of depth values. In some further examples, a determination may be made that a first cluster of image data corresponds to an object of interest, such as a human subject, in the image data. In various examples, pixels of the first cluster may be encoded with foreground indicator data. In some further examples, segmented image data may be generated. The segmented image data may comprise pixels encoded with the foreground indicator data.
Baldwin (US 9077891 B1) teaches a computing device can capture a plurality of images using a camera of the device, each image being captured with a different focus setting of the camera. In some embodiments, the capturing the plurality of images can be performed during an autofocus process of the camera. The device can determine depth information, such as a position of relative depth, for each of the plurality of images based on the state of the camera when each image was captured. Depth information for any object(s) in focus in a respective one of the plurality of images can be determined to correspond to the depth information for the respective image.
Goyal et al. (US 9723199 B1) teaches an imaging device may be configured to monitor a field of view for various objects or events occurring therein. The imaging device may capture a plurality of images at various focal lengths, identify a region of interest including one or more semantic objects therein, and determine measures of the levels of blur or sharpness within the regions of interest of the images. Based on their respective focal lengths and measures of their respective levels of blur or sharpness, a focal length for capturing subsequent images with sufficient clarity may be predicted. The imaging device may be adjusted to capture images at the predicted focal length, and such images may be captured. Feedback for further adjustments to the imaging device may be identified by determining measures of the levels of blur or sharpness within the subsequently captured images.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DION J SATCHER whose telephone number is (703)756-5849. The examiner can normally be reached Monday - Thursday 5:30 am - 2:30 pm, Friday 5:30 am - 9:30 am PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henok Shiferaw can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DION J SATCHER/Patent Examiner, Art Unit 2676
/Henok Shiferaw/Supervisory Patent Examiner, Art Unit 2676