Prosecution Insights
Last updated: April 18, 2026
Application No. 17/994,659

APPARATUS AND METHOD WITH IMAGE PROCESSING

Non-Final OA (§103, §112)
Filed: Nov 28, 2022
Examiner: FELIX, BRADLEY OBAS
Art Unit: 2671
Tech Center: 2600 — Communications
Assignee: Samsung Electronics Co., Ltd.
OA Round: 3 (Non-Final)
Grant Probability: 12% (At Risk)
OA Rounds: 3-4
To Grant: 3y 6m
With Interview: 78%

Examiner Intelligence

Career Allow Rate: 12% (2 granted / 17 resolved; -50.2% vs TC avg). Grants only 12% of cases.
Interview Lift: +66.7% among resolved cases with an interview (with vs. without interview); a strong lift.
Avg Prosecution: 3y 6m typical timeline; 29 applications currently pending.
Total Applications: 46 across all art units (career history).

Statute-Specific Performance

§101: 8.5% (-31.5% vs TC avg)
§103: 62.9% (+22.9% vs TC avg)
§102: 14.3% (-25.7% vs TC avg)
§112: 14.3% (-25.7% vs TC avg)
Comparison baseline: Tech Center average estimate • Based on career data from 17 resolved cases

Office Action

§103 §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Application has amended claims 1 and 19-20. Thus, application has pending claims 1-20.

Response to Arguments

Applicant’s arguments with respect to claims 1 and 19-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Thus, the new reference of Shao, in combination with LIAO, and separately with KIM, meets each of the new limitations of the claims as disclosed in the rejection below. Therefore, this action is made NON-FINAL.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1 and 19-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

According to newly amended claim 1, and similarly in claims 19-20, it discloses “fusing an upsampled feature map generated based on one or more scaled images of the first image with a feature map generated based on the one or more scaled images”. According to the Application’s Specification (¶73-75) and Drawings (FIG. 4), filed 11/28/2022, it would not be reasonable for one skilled in the art to use only one scaled image. Additionally, the feature map based on the one or more scaled images would need to use a different scaled image than the upsampled feature map. Appropriate correction is required.

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1 and 19-20 recite the limitation “detect a target region in the first image, based on the feature map” in the Claims pages 2 and 7, respectively.
It is unclear whether the feature map is referring to the generated feature map of a first image, or the feature map generated based on the one or more scaled images. Appropriate correction is required. For purposes of examination, the feature map is interpreted to be the generated feature map of the first image.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dan-ping LIAO CN-113255682-B, hereinafter LIAO, in further view of Xiaotao Shao et al., Multi-Scale Feature Pyramid Network.

As per claim 20, LIAO teaches an apparatus with image processing, the apparatus comprising: a processor configured to (see LIAO page 4/24, wherein a processor is disclosed): generate a feature map of a first image and detect a target region in the first image, based on the feature map (see LIAO top of page 3/24, wherein the feature map with the area of the target is disclosed); correct the detected target region (see LIAO top of page 3/24, wherein the detection module further corrects the target candidate area); and process an object corresponding to the target region, based on the corrected target region (see LIAO top of page 3/24, wherein the detection module, after correcting, obtains the final position of the detection target. See further bottom of page 3/24, wherein target detection of the object type is also disclosed).

However, LIAO fails to explicitly disclose where Shao teaches: generate a feature map of a first image by fusing an upsampled feature map generated based on one or more scaled images of the first image with a feature map generated based on the one or more scaled images (see Shao page 4/15 and FIGS. 1-2, wherein feature maps, such as f_out_2, are generated by the fusion of the upscaled feature map f_up_3 and a feature map f_in_2 based off the scaled input image, similar to Application’s Drawing FIG. 4).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify LIAO’s apparatus by using Shao’s teaching by including the fusion of an upsampled feature map and another feature map to the final feature map generation in order to improve detection and localization accuracy.

As per claim 1, the rationale provided in claim 20 is incorporated herein. In addition, the method of claim 1 corresponds to the apparatus of claim 20.
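For orientation, the fusion step the examiner attributes to Shao is the familiar top-down pathway of a feature pyramid: a coarser-scale feature map is upsampled and combined with the feature map at the next finer scale. The sketch below illustrates only that general pattern; the module name, the 1x1/3x3 convolutions, and the channel sizes are assumptions made for illustration, not code from Shao or from the application.

```python
# Illustrative sketch only: standard FPN-style top-down fusion in PyTorch,
# mirroring the claimed "fusing an upsampled feature map ... with a feature
# map generated based on the one or more scaled images".
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    def __init__(self, in_channels: int, out_channels: int = 256):
        super().__init__()
        # 1x1 conv projects the finer-scale ("lateral") feature map to out_channels
        self.lateral = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        # 3x3 conv smooths the fused result
        self.smooth = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, f_in_2: torch.Tensor, f_3: torch.Tensor) -> torch.Tensor:
        # f_in_2: finer-scale feature map; f_3: coarser-scale feature map
        # (f_3 is assumed to already have out_channels channels)
        f_up_3 = F.interpolate(f_3, size=f_in_2.shape[-2:], mode="nearest")  # upsample
        f_out_2 = self.lateral(f_in_2) + f_up_3                              # fuse
        return self.smooth(f_out_2)

# e.g.: fuse = TopDownFusion(in_channels=128, out_channels=256)
#       out = fuse(torch.rand(1, 128, 64, 64), torch.rand(1, 256, 32, 32))  # -> (1, 256, 64, 64)
```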
As per claim 18, LIAO teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, configure the processor to perform the method of claim 1 (see LIAO page 4/24, wherein the computer medium that contains a computer program is disclosed).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over JAEMIN KIM KR-20200119369-A, hereinafter KIM, in further view of YING Shao, CN-109740448-A.

As per claim 19, KIM discloses a processor-implemented method with image processing, the method comprising: generating a feature map of a first image by performing, using a convolutional neural network, a convolution operation on the first image using a convolution kernel corresponding to each of one or more positions in the first image (see KIM bottom half of page 5/29, wherein an image is input into a convolutional neural network and the feature extractor extracts features of the image through filtering, i.e., generating the feature map, such as the position of the object); and processing an object in the first image, based on the feature map of the first image (see KIM middle of page 6/29, wherein the features of the objects are processed by the preprocessor).

However, KIM fails to explicitly disclose where Shao teaches: generate a feature map of a first image by fusing an upsampled feature map generated based on one or more scaled images of the first image with a feature map generated based on the one or more scaled images (see Shao page 4/15 and FIGS. 1-2, wherein feature maps, such as f_out_2, are generated by the fusion of the upscaled feature map f_up_3 and a feature map f_in_2 based off the scaled input image, similar to Application’s Drawing FIG. 4).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify KIM’s method by using Shao’s teaching by including the fusion of an upsampled feature map and another feature map to the final feature map generation in order to improve detection and localization accuracy.

Claims 2-4 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over LIAO, in combination with Shao, in further view of KIM.

As per claim 2, LIAO, in combination with Shao, fails to explicitly disclose where KIM teaches the method of claim 1, wherein the generating of the feature map of the first image comprises generating the one or more feature maps of the first image by extracting a feature of the first image from the one or more scaled images of the first image (see KIM page 5/29, wherein the feature extractor extracts the features of an image through down-sampling of the input image, i.e., scaled images. See also FIG. 5, wherein the filtering of the convolution layers, i.e., feature maps, is disclosed); and the detecting of the target region in the first image comprises detecting a target region in the first image based on the one or more feature maps (see KIM top of page 8/29, wherein the area of the object is acquired from the merged features, i.e., feature maps, of the image).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify LIAO’s, in combination with Shao, method by using KIM’s teaching by including the scaled images to the generated feature maps in order to more specifically extract features of a particular region of the image.
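The claim 2-4 mapping turns on extracting a feature map from each of one or more scaled (down-sampled) copies of the input image with a convolutional network. A minimal sketch of that general pattern, with an assumed toy backbone and assumed scale factors (not KIM's or the application's actual network), might look like this:

```python
# Illustrative sketch only: one feature map per scaled copy of the input image,
# computed by a shared convolutional backbone. Layer sizes and scale factors
# are placeholders chosen for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
)

def multi_scale_features(image: torch.Tensor, scales=(1.0, 0.5, 0.25)):
    """Return one feature map per scaled copy of `image` (shape N, 3, H, W)."""
    feats = []
    for s in scales:
        scaled = image if s == 1.0 else F.interpolate(
            image, scale_factor=s, mode="bilinear", align_corners=False)
        feats.append(backbone(scaled))  # convolution on the scaled image
    return feats

# e.g.: maps = multi_scale_features(torch.rand(1, 3, 416, 416))
#       # produces feature maps at 416x416, 208x208, and 104x104 resolutions
```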
As per claim 3, LIAO, in combination with Shao and KIM, teaches the method of claim 2, wherein the generating of the one or more feature maps of the first image by extracting the feature of the first image from the one or more scaled images comprises generating a feature map at each of one or more scales by performing a convolution operation on each of the one or more scaled images with a convolutional neural network (see KIM bottom of page 5/29 and FIG. 5, wherein the feature extraction unit performs filtering on the convolution layers as it down-samples, i.e., generating a feature map on the scaled images), and for each of the one or more scaled images of the first image, the convolutional neural network is configured to perform the convolution operation on each of one or more positions on the scaled image by using a convolution kernel corresponding to each of the one or more positions (see KIM top of page 6/29 and FIG. 5, wherein the YOLO, i.e., convolutional network, convolves each of the convolutional layers using filters having a size of 3x3, i.e., convolution kernel. See also bottom of page 5/29, wherein the position of the vehicle, or object, is acquired from feature extraction).

As per claim 4, LIAO, in combination with Shao and KIM, teaches the method of claim 3, wherein the generating of the feature map at each of the one or more scales by performing the convolution operation on each of the one or more scaled images with the convolutional neural network comprises: determining a sampling position of the convolution kernel corresponding to each of the one or more positions on the one or more scaled images (see KIM top of page 6/29 and FIG. 5, wherein the pooling layer convolved using a 3x3 filter for down-sampling is disclosed); and generating the feature map at each of the one or more scales by performing the convolution operation according to the sampling position of the convolution kernel corresponding to each of the one or more positions (see KIM top of page 6/29 and FIG. 5, wherein the pooling layer down-samples to a 208x208 filtered image, i.e., feature map).

As per claim 6, LIAO, in combination with Shao and KIM, teaches the method of claim 2, wherein the one or more feature maps comprises a plurality of feature maps, and the detecting of the target region in the first image based on the one or more feature maps comprises fusing feature maps of adjacent scales in the plurality of feature maps and detecting a target region in the first image based on one or more fused feature maps (see KIM top of page 8/29, wherein the feature layers are merged and the bounding box coordinates around the object are obtained using the merged features).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over LIAO, in combination with Shao and KIM, in further view of Jia GUO CN-113435260-A, hereinafter GUO.

As per claim 5, LIAO, in combination with Shao and KIM, fails to explicitly disclose where GUO teaches the method of claim 4, wherein, for each of the one or more scaled images of the first image, the determining of the sampling position of the convolution kernel corresponding to each of the one or more positions on the scaled image comprises: determining the sampling position of the convolution kernel corresponding to each of the one or more positions in a three-dimensional (3D) space according to an imaging model of the first image (see GUO page 7/45, wherein the sample position of the image in the three-dimensional kernel is disclosed. Space information is also acquired); and determining the sampling position of the convolutional kernel corresponding to each of the one or more positions in the scaled image, according to the sampling position of the convolutional kernel in the 3D space and the imaging model (see GUO page 7/45, wherein the sample position and space information in the three-dimensional kernel is disclosed. See further bottom of page 8, wherein the down-sampling layer and target positioning is disclosed).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify LIAO’s, in combination with KIM, method by using GUO’s teaching by including the sampling position to the scaled images in order to further acquire the coordinates of the region of the image.

Claims 7, 9, and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over LIAO, in combination with Shao, in further view of HONG-JUN WANG CN-113312973-A, hereinafter WANG.

As per claim 7, LIAO, in combination with Shao, fails to explicitly disclose where WANG teaches the method of claim 1, wherein the correcting of the detected target region comprises: determining a first feature region corresponding to the detected target region in the feature map of the first image to be a first target region feature map (see WANG bottom of page 5/26, wherein the candidate frame is acquired from the feature extraction of the image); and generating a transformed first target region feature map by spatially transforming the first target region feature map (see WANG page 6/26, wherein the ROIAlign transforms the feature map by changing the area into a uniform size), and the processing of the object corresponding to the target region based on the corrected target region comprises processing the object corresponding to the target region based on the transformed first target region feature map (see WANG bottom of page 6/26, wherein the segmentation of the target, i.e., the target object, is obtained after the transformation).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify LIAO’s, in combination with Shao, method by using WANG’s teaching by including a first feature region and transforming the target region to the correction of the target region in order to further adjust and correct the distortions to acquire an accurate target region.

As per claim 9, LIAO, in combination with Shao and WANG, discloses the method of claim 7, wherein the processing of the object corresponding to the target region based on the transformed first target region feature map comprises: generating first attribute information of the object corresponding to the target region, based on the transformed first target region feature map (see WANG page 6/26, wherein the ROIAlign obtains the object shape mask and the key points, i.e., attribute information, using the transformed feature map of the hand region); and processing the object corresponding to the target region, according to the first attribute information (see WANG bottom of page 6/26, wherein the hand is calibrated using the hand key point detection).
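The "uniform size" transform the examiner attributes to WANG's ROIAlign corresponds to the standard RoIAlign operation: the feature-map region inside a detected box is resampled to a fixed spatial size. A hedged illustration using torchvision's roi_align, with made-up feature-map and box values:

```python
# Illustrative sketch only: resample the feature-map region inside a detected
# box to a uniform size with torchvision's roi_align. The feature map, box
# coordinates, and output size below are made-up values, not from WANG or the
# application.
import torch
from torchvision.ops import roi_align

feature_map = torch.rand(1, 256, 50, 50)            # (N, C, H, W) backbone output
# One detected target region in feature-map coordinates: (batch_idx, x1, y1, x2, y2)
boxes = torch.tensor([[0, 4.0, 6.0, 28.0, 40.0]])
region = roi_align(feature_map, boxes, output_size=(7, 7), spatial_scale=1.0,
                   sampling_ratio=2, aligned=True)
print(region.shape)  # torch.Size([1, 256, 7, 7]) -- fixed-size region feature map
```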
As per claim 11, LIAO, in combination with Shao and WANG, discloses the method of claim 9, wherein the processing of the object corresponding to the target region comprises performing, on the object, any one or any combination of any two or more of object recognition, object segmentation, and object pose estimation (see WANG bottom of page 6/26, wherein the hand is recognized, segmented, and the pose is obtained).

As per claim 12, LIAO, in combination with Shao and WANG, discloses the method of claim 9, wherein the first attribute information comprises any one or any combination of any two or more of category information of the object, mask information of the object, key point information of the object, and pose information of the object (see WANG page 6/26 and more specifically page 7/26, wherein the hand key point, gesture key point, and the object shape mask are obtained).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over LIAO, in combination with Shao and WANG, in further view of Yong-duan SONG CN-110188689-A, hereinafter SONG.

As per claim 8, LIAO, in combination with Shao and WANG, fails to explicitly disclose where SONG teaches the method of claim 7, wherein the generating of the transformed first target region feature map by spatially transforming the first target region feature map comprises: generating a virtual camera corresponding to the target region, according to an imaging model of the first image and the detected target region (see SONG page 3/31, wherein extracting the virtual camera of the scene is disclosed); and generating the transformed first target region feature map by spatially transforming the first target region feature map with the virtual camera (see SONG page 3/31, wherein the conversion scale of the feature image is disclosed after the virtual camera is extracted in step a).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify LIAO’s, in combination with Shao and WANG, method by using SONG’s teaching by including the virtual camera to the transformed target region in order to acquire the best perspective for spatially transforming the image.

Claims 10 and 13-17 are rejected under 35 U.S.C. 103 as being unpatentable over LIAO, in combination with Shao and WANG, in further view of Heindl et al., hereinafter Heindl (included in applicant’s IDS submitted 05/04/2023).

As per claim 10, LIAO, in combination with Shao and WANG, fails to explicitly disclose where Heindl teaches the method of claim 9, further comprising: generating a second image associated with the first image (see Heindl bottom of page 3/8, wherein 2D human poses are estimated in rectilinear views from both fisheye cameras. See further page 4/8, 3D Human Pose Reconstruction, wherein at least two images with projections of the same space is disclosed); and generating second attribute information of the object, based on the second image (see Heindl page 4/8, 2D Human Pose Estimation, wherein anatomic key point locations are disclosed), and wherein the processing of the object corresponding to the target region according to the first attribute information comprises processing the object corresponding to the target region, according to the first attribute information and the second attribute information (see Heindl page 4/8, 3D Human Pose Reconstruction, and FIG. 6, wherein the two images are applied, or processed, in order to create a three-dimensional reconstruction of the body).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify LIAO’s, in combination with Shao and WANG, method by using Heindl’s teaching by including a second image associated with the first image in order to further acquire a second perspective of the scene for a more accurate calculation.

As per claim 13, LIAO, in combination with Shao, WANG, and Heindl, discloses the method of claim 10, wherein the first attribute information comprises first key point information of the object and initial pose information of the object, the second attribute information comprises second key point information of the object (see Heindl page 4/8 and FIG. 5, wherein the joints are predicted based off the initial position of the person for both fisheye images), and the processing of the object corresponding to the target region according to the first attribute information and the second attribute information comprises estimating final pose information of the object, based on the initial pose information, the first key point information, and the second key point information (see Heindl page 4/8 and FIG. 6, wherein the two fisheye images and their predicted joints are used in order to create the metric body model of the person in the image).

As per claim 14, LIAO, in combination with Shao, WANG, and Heindl, discloses the method of claim 13, wherein the generating of the second attribute information of the object based on the second image comprises: determining a target region corresponding to the object in the second image, based on the initial pose information (see Heindl page 4/8 and FIG. 5, wherein the human body joints and limbs are computed from the standing pose of the person), a parameter of a first camera generating the first image and a parameter of a second camera generating the second image (see Heindl page 3/8, Rectilinear View Generation, wherein the calculation formulas, i.e., parameters, to generate the rectilinear images from the fisheye cameras are disclosed. See also page 2/8); and generating the second key point information of the object, based on the target region corresponding to the object in the second image (see Heindl page 4/8 and FIG. 5, wherein the second fisheye image is used in order to generate the joints in the rectilinear input image).

As per claim 15, LIAO, in combination with Shao, WANG, and Heindl, discloses the method of claim 14, wherein the determining of the target region corresponding to the object in the second image, based on the initial pose information, the parameter of the first camera generating the first image, and the parameter of the second camera generating the second image, comprises: determining the initial pose information of the object in a coordinate system of the first camera, based on the initial pose information and the parameter of the first camera (see Heindl bottom of page 4/8, wherein the initial people positions in the fisheye images are disclosed. See prior page 2/8 and FIG. 2, wherein the camera parameters are disclosed); determining the initial pose information of the object in a coordinate system of the second camera, based on the initial pose information of the object in the coordinate system of the first camera and the parameter of the second camera (see Heindl bottom of page 4/8, wherein the initial people positions in the fisheye images are disclosed. This can also be done for the second fisheye camera as stated on the bottom of page 3/8. See prior page 2/8 and FIG. 2, wherein the camera parameters are disclosed); and determining the target region corresponding to the object in the second image according to the initial pose information of the object in the coordinate system of the second camera (see Heindl page 4/8 and FIG. 5, wherein the joint positions of the rectilinear input image from the second fisheye camera are disclosed. See prior page 3/8, wherein the coordinates for the rectilinear view [i_x, i_y, 1]^T are acquired).

As per claim 16, LIAO, in combination with Shao, WANG, and Heindl, discloses the method of claim 14, wherein the generating of the second key point information of the object, based on the target region corresponding to the object in the second image, comprises: correcting the target region corresponding to the object in the second image (see WANG page 6/26, wherein the ROIAlign transforms the feature map by changing the area into a uniform size); and generating the second key point information of the object, based on the corrected target region in the second image (see WANG bottom of page 6/26, wherein the key points of the target are obtained after the transformation correction).

As per claim 17, LIAO, in combination with Shao, WANG, and Heindl, discloses the method of claim 16, wherein the correcting of the target region corresponding to the object in the second image comprises: generating a feature map of the second image (see WANG top of page 2/26, wherein a feature map is created); determining, to be a second target region feature map, a second feature region corresponding to the target region in the second image in the feature map of the second image (see WANG bottom of page 5/26, wherein the candidate frame is acquired from the feature extraction of the image); and generating a transformed second target region feature map by spatially transforming the second target region feature map (see WANG page 6/26, wherein the ROIAlign transforms the feature map by changing the area into a uniform size), and the generating of the second key point information of the object, based on the corrected target region in the second image comprises generating the second key point information of the object, based on the transformed second target region feature map (see WANG bottom of page 6/26, wherein the key points of the target are obtained after the transformation correction of the candidate frame of the feature map).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Bradley Obas Felix whose telephone number is (703) 756-1314. The examiner can normally be reached M-F 8-5 EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRADLEY O FELIX/
Examiner, Art Unit 2671

/VINCENT RUDOLPH/
Supervisory Patent Examiner, Art Unit 2671

Prosecution Timeline

Nov 28, 2022
Application Filed
Apr 14, 2025
Non-Final Rejection — §103, §112
Jun 25, 2025
Response Filed
Sep 03, 2025
Final Rejection — §103, §112
Oct 09, 2025
Response after Non-Final Action
Jan 12, 2026
Request for Continued Examination
Jan 26, 2026
Response after Non-Final Action
Mar 31, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592076
OBJECT IDENTIFICATION SYSTEM AND METHOD
Granted Mar 31, 2026 • 2y 5m to grant
Patent 12340540
AN IMAGING SENSOR, AN IMAGE PROCESSING DEVICE AND AN IMAGE PROCESSING METHOD
Granted Jun 24, 2025 • 2y 5m to grant
Study what changed to get past this examiner. Based on 2 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 12%
With Interview: 78% (+66.7%)
Median Time to Grant: 3y 6m
PTA Risk: High
Based on 17 resolved cases by this examiner. Grant probability derived from career allow rate.
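As a rough check on how these headline figures relate, the sketch below recomputes them from the counts shown on this page. The 12% career allow rate follows directly from 2 grants out of 17 resolved cases; the with/without-interview breakdown is not reported here, so the two interview rates used to reproduce the +66.7% lift are placeholders rather than actual data.

```python
# Rough sketch of how the dashboard figures appear to relate (assumptions noted).
granted, resolved = 2, 17
career_allow_rate = granted / resolved        # 0.1176... -> displayed as "12%"

# The with/without-interview split is NOT given on this page; these two rates
# are placeholders chosen only to reproduce the displayed 78% and +66.7%.
rate_with_interview = 0.78                    # "With Interview"
rate_without_interview = 0.113                # implied no-interview baseline
interview_lift = rate_with_interview - rate_without_interview   # ~0.667 -> "+66.7%"

print(f"career allow rate: {career_allow_rate:.1%}")   # 11.8%
print(f"interview lift:    {interview_lift:+.1%}")     # +66.7%
```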
