Prosecution Insights
Last updated: April 19, 2026
Application No. 18/070,305

IMAGE PROCESSING METHOD AND APPARATUS, COMPUTER DEVICE, PROGRAM, AND STORAGE MEDIUM

Non-Final OA §103
Filed: Nov 28, 2022
Examiner: NAH, JONGBONG
Art Unit: 2674
Tech Center: 2600 — Communications
Assignee: Tencent Technology (Shenzhen) Company Limited
OA Round: 3 (Non-Final)

Grant Probability: 75% (Favorable)
OA Rounds: 3-4
To Grant: 2y 12m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 75% (78 granted / 104 resolved), +13.0% vs TC avg, above average
Interview Lift: +15.2% across resolved cases with an interview (strong)
Avg Prosecution: 2y 12m typical timeline, 24 applications currently pending
Career History: 128 total applications across all art units
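For readers who want the arithmetic behind these headline figures, here is a minimal sketch. The granted/resolved totals come from the panel above; the per-interview split is a hypothetical placeholder, since the dashboard reports only the resulting lift.

```python
# Minimal sketch of where the headline examiner stats come from.
# The granted/resolved totals are from the panel above; the per-interview
# split below is a hypothetical placeholder (only the resulting lift is
# reported), chosen so the arithmetic lands near the published +15.2%.

granted, resolved = 78, 104
career_allow_rate = granted / resolved          # 0.75 -> "75% Career Allow Rate"

with_interview = {"granted": 33, "resolved": 39}     # placeholder counts
without_interview = {"granted": 45, "resolved": 65}  # placeholder counts

lift = (with_interview["granted"] / with_interview["resolved"]
        - without_interview["granted"] / without_interview["resolved"])

print(f"Career allow rate: {career_allow_rate:.0%}")  # 75%
print(f"Interview lift: {lift:+.1%}")                 # about +15.4% here
```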

Statute-Specific Performance

§101: 10.1% (-29.9% vs TC avg)
§103: 58.8% (+18.8% vs TC avg)
§102: 24.7% (-15.3% vs TC avg)
§112: 2.8% (-37.2% vs TC avg)
Tech Center averages are estimates • Based on career data from 104 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/06/2025 has been entered.

Response to Arguments

With regard to the arguments, Applicant states that Tang et al does not disclose/teach/suggest, as to the amended claims, that "each original image frame includ[es] multiple pixels," that each confidence map includes many values, where each value "represents a confidence level of one respective pixel point" of an original image frame and "is indicative of whether the one respective pixel point is retained during image fusion"; in other words, the confidence map is a pixel-level confidence map, and therefore the rejection under 35 U.S.C. 102 should be removed (Emphasis added; Remarks, pages 9-10). Applicant's arguments have been considered but are moot in view of the new ground(s) of rejection over Tang et al (CN 110070511 A; see translation provided by Examiner) in view of Varekamp et al (US 2018/0309974 A1).

Office Action Summary

Claim(s) 9 is/are cancelled. Claim(s) 1-4, 7-8, 10-15, and 17-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tang et al (CN 110070511 A; see translation provided by Examiner) in view of Varekamp et al (US 2018/0309974 A1). Claim(s) 5-6 and 16 is/are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-4, 7-8, 10-15, and 17-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tang et al (CN 110070511 A; See translation provided by Examiner) in view of Varekamp et al (US 2018/0309974 A1).

Regarding claim(s) 1, 12, and 18, Tang teaches a computing device, comprising: one or more processors (Figure 8: “processor 501”; and Paragraph [00276]); and memory storing one or more programs (Figure 8: “memory 502”; and Paragraph [00276]), the one or more programs comprising instructions that, when executed by the one or more processors (Figure 8: “processor 501”; and Paragraph [00276]), cause the one or more processors to perform operations comprising: acquiring an original image sequence, the original image sequence including a plurality of original image frames, each original image frame including multiple pixels (Paragraph [00109]: “Obtaining an image frame sequence, where the image frame sequence includes an image frame to be processed and one or more image frames adjacent to the image frame to be processed”); performing image preprocessing on the original image sequence by using an image preprocessing network comprising M confidence level blocks connected in series, M being a positive integer that is at least two (Paragraph [003]: “The process of video restoration often includes four steps: feature extraction, multi-frame alignment, multi-frame fusion and reconstruction”; Paragraphs [00238] – [00239] and Figure 6: “depicts a device structure including a Preprocessing module → Alignment module → Fusion module → Reconstruction module, confirming that the system is implemented as multiple serially connected blocks (M ≥ 2)”), the image preprocessing comprising performing serial processing on the original image sequence by using the M confidence level blocks to obtain a feature map sequence corresponding to the original image sequence and a confidence map sequence corresponding to the original image sequence (Paragraph [00138]; Paragraph [00141]: “The weight information of each of the aligned feature data may be separately determined by using a plurality of similarity features between the plurality of alignment feature data and the alignment feature data corresponding to the image frame to be processed, wherein the weight information may be represented by all the alignment features.
The different importance of different frames in the data can be understood as determining the importance of different image frames according to the degree of similarity”; and Paragraph [0052] – [0054]: computing “a plurality of similarity features” between frames, using them to determine “weight information of each of the plurality of aligned feature data,” and then fusing these weighted features to form the final fused representation), wherein: the feature map sequence is a sequence of feature maps obtained by performing feature extraction on all of the original image frames (Paragraph [00118]: “the image frame to be processed and the image frame in the image frame sequence may be image-aligned based on the first image feature set and the one or more second image feature sets to obtain multiple Aligning feature data, wherein the first image feature set includes at least one different scale feature data of the image frame to be processed, and the second image feature set includes at least one different scale feature of one of the image frame sequences data”; and Paragraph [00119]: “Specifically, for the image frame in the image frame sequence, the feature data of the image frame may be obtained after feature extraction. Further, feature data of different scales of the above image frames can be obtained to form an image feature set”); (Paragraph [00143]: “[…] higher the weight value indicates that the alignment feature data is more important in all frames, that is, needs to be retained, and the lower the weight value, indicates that the alignment feature data is less important in all frames. The performance is low, and the image frame to be processed may have an error, the occlusion element, or the aligning phase is not effective, and may be omitted”; and Paragraph [00189]: “ […] the preset threshold may be in the range of (0, 1). For example, the alignment feature data whose weight value is less than the preset threshold may be ignored. And retaining the alignment feature data whose weight value is greater than the preset threshold […] the importance degree of the above-mentioned alignment feature data is filtered and represented, which facilitates rationalized multi-frame fusion and reconstruction”); performing the feature fusion on the feature map sequence based on the confidence map sequence, to obtain a target fused feature map corresponding to a target original image frame in the original image sequence (Paragraph [00239]: “The merging module 320 is further configured to combine the plurality of aligning feature data according to the weight information of each of the aligning feature data to obtain the fused information of the image frame sequence, where the fused information is used to obtain the image frame corresponding to the image to be processed”); and reconstructing the target original image frame based on the target fused feature map to obtain a target reconstructed image frame (Paragraph [00154]: “The fusion information of the image frame sequence can be obtained by the above method, and then the image reconstruction can be performed according to the fusion information, and the processed image frame corresponding to the image frame to be processed is obtained, and a high quality frame can usually be recovered to realize image restoration. 
Optionally, the foregoing image processing may be performed on the plurality of to-be processed image frames, and the processed image frame sequence is obtained, where the plurality of the processed image frames are included, that is, the video data may be formed to achieve the effect of video restoration”).

Tang fails to teach the confidence map sequence comprising a plurality of confidence maps corresponding to the plurality of original image frames, each confidence map corresponds to one respective original image frame and includes multiple values corresponding to the multiple pixels of the one respective image frame, and each value represents a confidence level of one respective pixel point.

However, Varekamp teaches the confidence map sequence comprising a plurality of confidence maps corresponding to the plurality of original image frames, each confidence map corresponds to one respective original image frame and includes multiple values corresponding to the multiple pixels of the one respective image frame, each value represents a confidence level (read as “binary, non-binary, and soft decision”) of one respective pixel point (Paragraph [0048]: “providing a confidence map comprising confidence values for pixels of the depth map, the confidence value for a pixel designating the pixel as a confident pixel or a non-confident pixel”; Paragraph [0019] – Paragraph [0020]: “the confidence values may be binary values denoting the corresponding pixels as a confident or non-confident pixel […] In some embodiments, the confidence values may be non-binary values indicating an estimated reliability”; and Paragraph [0072]: “a soft-decision confidence estimate value may be stored for each pixel. However, such a non-binary confidence value will still reflect whether the pixel is considered a confident pixel or a non-confident pixel. Indeed, the process may consider all pixels for which a non-binary confidence value is above a threshold as a confident pixel, and pixels for which the value is below (or equal to) the threshold as non-confident pixels”).

It would have been obvious to a person of ordinary skill in the art at the time of the invention to incorporate Varekamp’s per-pixel confidence values into Tang’s multi-frame image processing and feature fusion framework in order to explicitly represent pixel-level confidence information for use during feature fusion, since Tang already teaches retaining or ignoring feature data during fusion based on per-element values and Varekamp teaches assigning confidence values to individual pixels to indicate their reliability. The combination merely represents a predictable use of prior art elements according to their established functions and results in performing image preprocessing to obtain a feature map sequence and a confidence map sequence, performing feature fusion based on the confidence map sequence to obtain a fused feature map, and reconstructing a target image frame. This motivation for the combination of Tang and Varekamp is supported by KSR exemplary rationale (G): Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. MPEP 2141 (III).
Regarding claim(s) 2, 13, and 19, Tang as modified by Varekamp teaches the method according to claim 1, wherein performing the feature fusion on the feature map sequence based on the confidence map sequence comprises: Where Varekamp teaches determining, from the confidence map sequence, a target confidence map corresponding to the target original image frame (Paragraph [0048]: “providing a confidence map comprising confidence values for pixels of the depth map, the confidence value for a pixel designating the pixel as a confident pixel or a non-confident pixel”); where Tang teaches determining, from the feature map sequence, a target feature map corresponding to the target original image frame (Paragraph [00118]: “the image frame to be processed and the image frame in the image frame sequence may be image-aligned based on the first image feature set and the one or more second image feature sets to obtain multiple Aligning feature data, wherein the first image feature set includes at least one different scale feature data of the image frame to be processed, and the second image feature set includes at least one different scale feature of one of the image frame sequences data”; and Paragraph [00119]: “Specifically, for the image frame in the image frame sequence, the feature data of the image frame may be obtained after feature extraction. Further, feature data of different scales of the above image frames can be obtained to form an image feature set”); determining a first fused feature map based on the target confidence map and the target feature map (Paragraph [00143]: “[…] higher the weight value indicates that the alignment feature data is more important in all frames, that is, needs to be retained, and the lower the weight value, indicates that the alignment feature data is less important in all frames. 
The performance is low, and the image frame to be processed may have an error, the occlusion element, or the aligning phase is not effective, and may be omitted”; and Paragraph [00239]: “The merging module 320 is further configured to combine the plurality of aligning feature data according to the weight information of each of the aligning feature data to obtain the fused information of the image frame sequence, where the fused information is used to obtain the image frame corresponding to the image to be processed”); performing feature fusion on the feature map sequence based on the target confidence map to obtain a second fused feature map (Paragraph [00128]: “Acquiring the first feature data with the smallest size of the first image feature set, and the second feature data of the second image feature set having the same size as the first feature data, and performing the first feature data and the second feature data Aligning images to obtain first alignment feature data”; and Paragraph [00239]: “The merging module 320 is further configured to combine the plurality of aligning feature data according to the weight information of each of the aligning feature data to obtain the fused information of the image frame sequence, where the fused information is used to obtain the image frame corresponding to the image to be processed”); and performing feature fusion on the first fused feature map and the second fused feature map to obtain the target fused feature map (Paragraph [00241]: “performing image alignment on the image frame to be processed and the image frame in the image frame sequence based on the first image feature set and the one or more second image feature sets to obtain a plurality of alignment feature data, wherein the first image feature”).

Regarding claim(s) 3 and 14, Tang as modified by Varekamp teaches the method according to claim 2, wherein: where Varekamp teaches determining the first fused feature map based on the target confidence map and the target feature map (Paragraph [0048]; Paragraph [0019] – Paragraph [0020]; and Paragraph [0072]) comprises: where Tang teaches multiplying confidence levels of pixel points in the target confidence map by feature values of the corresponding pixel points in the target feature map respectively, to obtain the first fused feature map (Paragraph [00140]: “In an optional implementation manner, the alignment feature data corresponding to the image frame to be processed may be determined by dot-multiplying each of the alignment feature data and the alignment feature data corresponding to the image frame to be processed. Multiple similarity features between”; and Paragraph [00150]: “In an optional implementation manner, each of the aligned feature data may be multiplied by the weight information of each of the aligned feature data by element-level multiplication to obtain a plurality of modulated feature data of the plurality of aligned feature data”).
Regarding claim(s) 4 and 15, Tang as modified by Varekamp teaches the method according to claim 2, where Tang teaches wherein performing feature fusion on the first fused feature map and the second fused feature map to obtain the target fused feature map comprises: adding feature values in the first fused feature map and feature values at the corresponding positions in the second fused feature map, to obtain the target fused feature map (Paragraph [00206]: “Specifically, according to the spatial attention information of each element point in the spatial feature data, each element point in the spatial feature data is modulated correspondingly by element-wise multiplication and addition, thereby obtaining the above Modulated fusion information”).

Regarding claim(s) 7, Tang as modified by Varekamp teaches the method according to claim 1, where Tang teaches wherein performing the feature fusion on the feature map sequence based on the confidence map sequence, to obtain a target fused feature map corresponding to a target original image frame in the original image sequence comprises: performing the feature fusion on the feature map sequence based on the confidence map sequence by using a feature fusion network, to obtain the target fused feature map corresponding to the target original image frame in the original image sequence (Paragraph [00149]: “the fusion convolution network may be used to combine the plurality of alignment feature data according to the weight information of each of the alignment feature data to obtain the fusion information of the image frame sequence”).

Regarding claim(s) 8, Tang as modified by Varekamp teaches the method according to claim 1, where Tang teaches wherein reconstructing the target original image frame based on the target fused feature map to obtain a target reconstructed image frame comprises: reconstructing the target original image frame based on the target fused feature map by using a reconstruction network, to obtain the target reconstructed image frame (Figure 5; Paragraph [00154]: “the image reconstruction can be performed according to the fusion information, and the processed image frame corresponding to the image frame to be processed is obtained, and a high quality frame can usually be recovered to realize image restoration”; and Paragraph [00225]: “The method is to perform multi-frame alignment and fusion with adjacent frames, and finally obtain the fusion information, and then input the reconstruction module to obtain the processed image frame according to the fusion information”).

Regarding claim(s) 10 and 17, Tang as modified by Varekamp teaches the method according to claim 1, where Tang teaches wherein: extracting at least one group of original image sequences from an original video, target original image frames in different original image sequences being corresponding to different timestamps in the original video (Paragraph [00116]: “The adjacent frames may be continuous or spaced. If the image frame to be processed is recorded as t, the adjacent frames may be recorded as t - i or t + i. For example, in a sequence of image frames of a video data, the image frames adjacent to the image frame to be processed may be the previous frame and/or the next frame of the image frame to be processed, or may be the image to be processed from the image to be processed”).
Regarding claim(s) 11 and 20, Tang as modified by Varekamp teaches the method of claim 1, where Tang teaches further comprising: after reconstructing the target original image frame based on the target fused feature map to obtain the target reconstructed image frame, generating a target video based on a plurality of target reconstructed image frames corresponding to all of the original image sequences and timestamps of the target original image frames corresponding to all of the target reconstructed image frames (Figure 5; and Paragraph [00225]: “the adjacent three frames of t-1, t, and t+1 are input as an input, and the image processing in the embodiment of the present application is performed by performing deblurring processing with the deblurring module, and sequentially inputting the PCD alignment module and the TSA fusion module. The method is to perform multi-frame alignment and fusion with adjacent frames, and finally obtain the fusion information, and then input the reconstruction module to obtain the processed image frame according to the fusion information”).

Allowable Subject Matter

Claim(s) 5-6 and 16 is/are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Relevant Prior Art Directed to State of Art

Benou et al (US 12,086,995 B2) are relevant prior art not applied in the rejection(s) above. Benou discloses a system for performing video background estimation, the system comprising: memory to store an input video picture; computer readable instructions; and at least one processor circuit to be programmed by the computer readable instructions to: generate a first estimated background picture and a corresponding first confidence map for the input video picture based on a temporal model; generate a second estimated background picture and a corresponding second confidence map for the input video picture based on a spatial model; select a fusion technique from among a plurality of available fusion techniques to combine the first estimated background picture and the second estimated background picture, selection of the fusion technique based on a magnitude difference between a confidence value corresponding to the first confidence map and a confidence value corresponding to the second confidence map; and combine the first estimated background picture and the second estimated background picture based on the selected fusion technique to generate a resultant estimated background picture for the input video picture.

Yang et al (US 2022/0284552 A1) are relevant prior art not applied in the rejection(s) above. Yang discloses a method for inpainting a sequence of initial image frames, the method comprising: receiving initial video data representing a sequence of initial image frames; generating optical flow displacement values between neighboring image frames of the sequence of initial image frames; warp-shifting image features from image feature maps of one or more neighboring image frames to image feature maps of a current image frame using the optical flow displacement values; and generating a sequence of complete image frames based on warp-shifted image features from the feature maps of the one or more neighboring image frames and image features from the image feature maps of the current image frame, the sequence of complete image frames including an inpainted version of the sequence of initial image frames.
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONGBONG NAH whose telephone number is (571) 272-1361. The examiner can normally be reached M - F: 9:00 AM - 5:30 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ONEAL MISTRY, can be reached at 313-446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JONGBONG NAH/
Examiner, Art Unit 2674

/ONEAL R MISTRY/
Supervisory Patent Examiner, Art Unit 2674
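The heart of the §103 mapping above is per-pixel, confidence-weighted feature fusion. The following is a minimal NumPy sketch of that style of operation, for orientation only; the function name, shapes, threshold rule, and fusion order are assumptions made for illustration and are not taken from Tang, Varekamp, or the application as filed.

```python
import numpy as np

def fuse_frames(feature_maps, confidence_maps, target_idx, threshold=0.5):
    """Toy confidence-weighted fusion over a short frame sequence.

    feature_maps, confidence_maps: lists of (H, W) float arrays, one per
    original frame; confidence values are per-pixel levels in [0, 1].
    All names, shapes, and the threshold rule are illustrative assumptions.
    """
    # "First fused feature map": the target frame's features weighted by its
    # own per-pixel confidence (element-wise multiplication, cf. claim 3).
    first_fused = feature_maps[target_idx] * confidence_maps[target_idx]

    # "Second fused feature map": fuse the whole sequence, keeping only pixels
    # whose confidence clears a threshold (echoing the retain/ignore weighting
    # the rejection attributes to Tang).
    masks = [(c >= threshold).astype(float) for c in confidence_maps]
    weights = [c * m for c, m in zip(confidence_maps, masks)]
    total = np.maximum(sum(weights), 1e-8)  # avoid divide-by-zero per pixel
    second_fused = sum(f * w for f, w in zip(feature_maps, weights)) / total

    # Target fused feature map: element-wise addition of the two maps
    # (cf. claim 4); a reconstruction network would consume this output.
    return first_fused + second_fused

# Usage with random stand-ins for extracted feature and confidence maps.
rng = np.random.default_rng(0)
feats = [rng.random((4, 4)) for _ in range(3)]
confs = [rng.random((4, 4)) for _ in range(3)]
print(fuse_frames(feats, confs, target_idx=1).shape)  # (4, 4)
```

The element-wise multiplication and addition correspond to the operations recited in claims 3 and 4 as characterized in the rejection.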

Prosecution Timeline

Nov 28, 2022: Application Filed
May 08, 2025: Non-Final Rejection — §103
Aug 12, 2025: Response Filed
Oct 30, 2025: Final Rejection — §103
Dec 23, 2025: Examiner Interview Summary
Dec 23, 2025: Applicant Interview (Telephonic)
Jan 05, 2026: Response after Non-Final Action
Jan 28, 2026: Request for Continued Examination
Jan 30, 2026: Response after Non-Final Action
Feb 03, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591952
Image rotation
2y 5m to grant • Granted Mar 31, 2026
Patent 12579737
ROTATING 3D SCANNER TO ENABLE PRONE CONTACTLESS REGISTRATION
2y 5m to grant • Granted Mar 17, 2026
Patent 12579775
IMAGE PROCESSING USING DATUM IDENTIFICATION AND MACHINE LEARNING ALGORITHMS
2y 5m to grant • Granted Mar 17, 2026
Patent 12580050
SPATIALLY CO-REGISTERED GENOMIC AND IMAGING (SCORGI) DATA ELEMENTS FOR FINGERPRINTING MICRODOMAINS
2y 5m to grant • Granted Mar 17, 2026
Patent 12567141
MEDICAL IMAGE SYNTHESIS DEVICE AND METHOD
2y 5m to grant • Granted Mar 03, 2026
Study what changed to get past this examiner. Based on the 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 75%
With Interview: 90% (+15.2%)
Median Time to Grant: 2y 12m
PTA Risk: High
Based on 104 resolved cases by this examiner. Grant probability derived from career allow rate.
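The with-interview figure is presented as the base grant probability plus the examiner's interview lift. A short sketch of that stated arithmetic follows; the rounding behavior is an assumption about the display, not a documented rule.

```python
# Sketch of the stated derivation: base grant probability plus interview lift.
# Rounding to a whole percentage point is an assumption about the display.
base_grant_probability = 0.75   # career allow rate
interview_lift = 0.152          # +15.2 percentage points

print(f"With interview: {base_grant_probability + interview_lift:.0%}")  # 90%
```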
