Prosecution Insights
Last updated: April 19, 2026
Application No. 18/696,896

VIDEO SUPER-RESOLUTION METHOD AND DEVICE

Status: Non-Final OA — §101, §102, §103
Filed: Mar 28, 2024
Examiner: THOMAS, SOUMYA
Art Unit: 2664
Tech Center: 2600 — Communications
Assignee: BEIJING ZITIAO NETWORK TECHNOLOGY CO., LTD.
OA Round: 1 (Non-Final)

Grant Probability: 100% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 100% (2 granted / 2 resolved) — +38.0% vs TC avg, above average
Interview Lift: +0.0% (minimal lift among resolved cases with interview)
Avg Prosecution: 2y 9m typical timeline; 17 applications currently pending
Career History: 19 total applications across all art units

Statute-Specific Performance

§101: 6.8% (-33.2% vs TC avg)
§102: 13.6% (-26.4% vs TC avg)
§103: 64.4% (+24.4% vs TC avg)
§112: 11.9% (-28.1% vs TC avg)
Deltas are measured against a Tech Center average estimate • Based on career data from 2 resolved cases
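The per-statute deltas above are mutually consistent with a single flat baseline: subtracting each stated delta from its examiner rate yields the same 40% Tech Center estimate in all four cases. A short sketch of that arithmetic (the flat 40% baseline is inferred from the displayed numbers, not documented by the tool):

```python
# Hypothetical reconstruction of the per-statute "vs TC avg" deltas shown above.
# The rates are the examiner's statute-specific success rates; the baseline is a
# single Tech Center average estimate (inferred, not published by the tool).
examiner_rates = {"101": 0.068, "102": 0.136, "103": 0.644, "112": 0.119}
tc_average = 0.40  # every displayed delta equals rate - 0.40

for statute, rate in examiner_rates.items():
    delta = rate - tc_average
    print(f"§{statute}: {rate:.1%} ({delta:+.1%} vs TC avg)")
```

Running this reproduces each delta on the card, e.g. `§103: 64.4% (+24.4% vs TC avg)`.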

Office Action

Grounds of rejection: §101, §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings

The drawings are objected to because ‘RBD’ in Fig. 2 should read ‘RDB’ (where ‘RDB’ stands for residual dense block). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
The claims do not fall within at least one of the four categories of patent eligible subject matter because Claim 11 is directed to a ‘computer-readable storage medium’. Under the broadest reasonable interpretation, this limitation could include a transitory computer-readable storage medium, such as a carrier wave (see MPEP § 2106.03). The examiner suggests rewriting Claim 11 to read ‘A non-transitory computer-readable storage medium’.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-3 and 6-8 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Nah et al. (S. Nah, H. Dong, et al., "NTIRE 2019 Challenge on Video Super-Resolution: Methods and Results," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019), hereinafter Nah.

As to Claim 1, Nah teaches a video super-resolution method, comprising (see pg. 1991, Section 4.7, “XJTU-IAIR team proposes a flow-guided spatio-temporal dense network (FSTDN) for the joint video deblurring and super-resolution task as shown in Fig. 9.”, and see the corresponding network shown in Fig. 9):

acquiring a first feature, wherein the first feature is a feature obtained by merging an initial feature of a target video frame and an initial feature of each of neighborhood video frames of the target video frame (see Fig. 9, where the 5D tensor is the first feature, formed by extracting features from target frame LR_t and the neighborhood frames LR_{t+1}, LR_{t+2}, LR_{t-1}, and LR_{t-2}); [Fig. 9 of Nah; Fig. 2 of Instant Application]

processing the first feature by concatenated multistage residual dense blocks (RDBs) (see Fig. 9, multiple residual dense blocks labeled 3D-RDB) to obtain a fusion feature output by a RDB in each stage (see Fig. 9, features output from the 3D-RDBs, labeled F_1, F_d, and F_D); [Fig. 9 of Nah; Fig. 2 of Instant Application]

for the fusion feature output by the RDB in each stage, aligning each of neighborhood features of the fusion feature with a target feature of the fusion feature to obtain an alignment feature corresponding to the RDB that outputs the fusion feature (see Fig. 9, where F_1^Warp, F_d^Warp, and F_D^Warp are all alignment features corresponding to their respective 3D-RDB blocks, and see the Feature Warping Layer of Fig. 9, where the neighborhood features comprising the ‘fusion feature’ F_D are warped to a target feature); [Fig. 9 of Nah; Fig. 2 and Fig. 4 of Instant Application]

wherein each of the neighborhood features of the fusion feature is a feature corresponding to each of the neighborhood video frames, and the target feature of the fusion feature is a feature corresponding to the target video frame (see Fig. 9, where the ‘fusion feature’ F_D is split per frame, the target feature F_{d,t} corresponds to a feature of the target frame, and the neighborhood features F_{d,t+1}, F_{d,t+2}, F_{d,t-1}, and F_{d,t-2} correspond to LR_{t+1}, LR_{t+2}, LR_{t-1}, and LR_{t-2} respectively); [Fig. 9 of Nah]

and generating a super-resolution video frame corresponding to the target video frame on the basis of the alignment feature corresponding to the RDB in each stage and the initial feature of the target video frame (see Fig. 9, super-resolution video frame HR_t, generated from the alignment features and the initial feature F_t, which is connected by the red dotted arrow). [Fig. 9 of Nah; Fig. 2 of Instant Application]

As to Claim 2, Nah teaches acquiring an optical flow between each of the neighborhood video frames and the target video frame respectively (see pg. 1991, Section 4.7, “XJTU-IAIR team proposes a flow-guided spatio-temporal dense network (FSTDN) for the joint video deblurring and super-resolution task as shown in Fig. 9.”, and see the calculated flows Flow_{t+1}, Flow_{t+2}, Flow_{t-1}, and Flow_{t-2}, which represent the optical flow between the target frame LR_t and each respective neighboring frame); [Fig. 9 of Nah; Fig. 2 of Instant Application]

and aligning each of neighborhood features of the fusion feature with a target feature of the fusion feature on the basis of the optical flow between each of the neighborhood video frames and the target video frame (see Fig. 9, ‘Feature Warping Layer’, where each feature of the fusion feature F_d is warped (aligned) using the flow calculated from the neighboring frames and the target frame), to obtain an alignment feature corresponding to the RDB that outputs the alignment feature (see Fig. 9, where F_d^Warp is generated for its respective 3D-RDB block). [Fig. 9 of Nah]

As to Claim 3, Nah teaches splitting the fusion feature to obtain each of the neighborhood features and the target feature (see Fig. 9, ‘Feature Warping Layer’, where the fusion feature F_d is split to obtain the target feature F_{d,t} and the neighboring features F_{d,t+1}, F_{d,t+2}, F_{d,t-1}, and F_{d,t-2}); [Fig. 9 of Nah]

aligning each of the neighborhood features with the target feature on the basis of the optical flow between each of the neighborhood video frames and the target video frame, to obtain an alignment feature for each of the neighborhood video frames (see Fig. 9, ‘Feature Warping Layer’, where each feature of F_d (F_{d,t+1}, F_{d,t+2}, F_{d,t}, F_{d,t-1}, F_{d,t-2}) is warped (or aligned) using the flow calculated from the neighboring frames and the target frame);

and merging the target feature and the alignment feature of each of the neighborhood video frames to obtain an alignment feature corresponding to the RDB that outputs the fusion feature (see Fig. 9, where the warped features of the fusion feature are concatenated to form F_d^Warp). [Fig. 9 of Nah]

As to Claim 6, Nah teaches that generating a super-resolution video frame corresponding to the target video frame on the basis of the alignment feature corresponding to the RDB in each stage and the initial feature of the target video frame comprises:

merging alignment features corresponding to the multistage RDBs to obtain a second feature (see Fig. 9, the alignment features F_1^Warp, F_d^Warp, and F_D^Warp being concatenated to form a 5D tensor); [Fig. 9 of Nah]

converting, based on a feature conversion network, the second feature into a feature having the same tensor as an initial feature of the target video frame to obtain a third feature (see Fig. 9, ‘Temporal Fusion’, and see how the initial 5D tensor (with dimensions n*(64*D)*5*h*w) is converted to a 4D tensor (with dimensions n*64*h*w); additionally, see how the initial feature of the target frame is summed with the fourth feature, implying that the third feature has the same dimensions as the initial feature); [Fig. 9 of Nah]

and generating a super-resolution video frame corresponding to the target video frame on the basis of the third feature and the initial feature of the target video frame (see Fig. 9, super-resolution video frame HR_t, generated from the 4D tensor and the initial feature F_t, which is connected by the red dotted arrow). [Fig. 9 of Nah]

As to Claim 8, Nah teaches: performing summation fusion on the third feature and the initial feature of the target video frame to obtain a fourth feature (see Fig. 9, where the 4D tensor is the ‘third feature’ and the initial feature is added, as indicated by the summation sign, to obtain the fourth feature); [Fig. 9 of Nah]

processing the fourth feature by a residual dense network RDN to obtain a fifth feature (see Fig. 9, where the fourth feature is input into a 2D RDN to obtain the fifth feature); [Fig. 9 of Nah]

and upsampling the fifth feature to obtain a super-resolution video frame corresponding to the target video frame (see the upsampling block after the 2D RDN, which then outputs the super-resolution frame HR_t). [Fig. 9 of Nah; Fig. 4 of Instant Application]

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2.
Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Nah et al. (S. Nah et al., "NTIRE 2019 Challenge on Video Super-Resolution: Methods and Results," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019), hereinafter Nah, in view of Gupta et al. (A. Gupta, et al., "Enhancing and experiencing spacetime resolution with videos and stills," 2009 IEEE International Conference on Computational Photography (ICCP)), hereinafter Gupta.

As to Claim 4, Nah fails to explicitly teach upsampling the target video frame and each of the neighborhood video frames of the target video frame, to obtain an upsampled video frame of the target video frame and an upsampled video frame of each of the neighborhood video frames; acquiring an optical flow between the upsampled video frame of each of the neighborhood video frames and the upsampled video frame of the target video frame; and aligning each of the neighborhood features of the fusion feature with the target feature of the fusion feature on the basis of the optical flow between the upsampled video frame of each of the neighborhood video frames and the upsampled video frame of the target video frame, to obtain an alignment feature corresponding to the RDB that outputs the fusion feature.

However, in an analogous art, Gupta teaches a method for enhancing the spacetime resolution of videos (see abstract on page 1), which includes upsampling adjacent video frames (see page 3, section 3.1, “The input consists of a stream of low-resolution frames with intermittent high-resolution stills. We upsample the low-resolution frames using bicubic interpolation to match the size of the high-resolution stills and denote them by fi.
For each fi, the nearest two high-resolution stills are denoted as Sleft and Sright”), then calculating the flow between the upsampled frames (see page 3, section 3.1, “The system estimates motion between every fi and corresponding Sleft & Sright… One approach is to compute optical flow directly from the high-resolution stills, Sleft or Sright, to the upsampled frames fi”), and then aligning the frames on the basis of optical flow between the upsampled video frames (see page 3, section 3.1, “Once the system has computed correspondences from Sleft to fi and Sright to fi, it warps the high-resolution stills to bring them into alignment with fi”).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the upsampling taught by Gupta with the super-resolution method taught by Nah. Gupta teaches on page 3, section 3.1, “The summed motion estimation serves as initialization to bring long range motion within the operating range of the optical flow algorithm and reduces the errors accumulated from the pairwise sums.” Thus, it would have been obvious to combine the teachings of Gupta with the teachings of Nah in order to obtain the invention as claimed in Claim 4.

Claims 7, 10, 13-14, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Nah et al. (S. Nah et al., "NTIRE 2019 Challenge on Video Super-Resolution: Methods and Results," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019), hereinafter Nah, in view of Hu et al. (CN 112565887), hereinafter Hu.
As to Claim 7, Nah teaches that the feature conversion network comprises a first convolutional layer, a second convolutional layer, and a third convolutional layer concatenated sequentially; and the second convolutional layer and the third convolutional layer both have a kernel of 3*3*3 and have a padding parameter of 0 in a time dimension and a padding parameter of 1 in both the length dimension and the width dimension (see Fig. 9, ‘Temporal Fusion’ block with three convolutional layers, and see the kernel and padding labeled for the second and third convolutional layers, where ‘k’ stands for kernel and ‘pad’ stands for padding). [Fig. 9 of Nah, with kernel and padding sizes; Fig. 5 of Instant Application, with kernel and padding sizes]

Nah fails to explicitly teach that the first convolutional layer has a kernel of 1*1*1 and has a padding parameter of 0 in each dimension. However, Hu teaches a super-resolution method which includes a pointwise convolution kernel (see paragraph [0102], “This application introduces the depthwise separable convolution in the neural network model. The depthwise separable convolution uses different convolution kernels for each channel of the input image for operation and operation, and the operation steps can be divided into depthwise convolution (Depthwise) Convolution with point (Pointwise)”, and see paragraph [0104], “The convolution kernel of deep convolution is k×k, the channel is cd, and the convolution kernel of point convolution is 1×1”, where it is known in the art that a pointwise kernel has no padding). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the pointwise convolutional kernel taught by Hu with the super-resolution method taught by Nah.
The motivation for doing so would be to reduce the amount of calculation needed (see paragraphs [0104] and [0106], “Further, the depth separable convolution is to split the one-step convolution operation into two steps of deep convolution and point convolution…Compared with the standard convolution, the amount of calculation is reduced”). Thus, it would have been obvious to combine the kernel taught by Hu with the teachings of Nah in order to obtain the invention as claimed in Claim 7.

As to Claim 10, Claim 10 is directed towards an electronic device, comprising a memory and a processor, the memory being configured to store a computer program, and the processor being configured to, when executing the computer program, cause the electronic device to implement the same method as claimed in Claim 1. Nah teaches the video super-resolution method of Claim 1, but fails to explicitly teach an electronic device comprising a memory and a processor. However, Hu teaches a video super-resolution device (see paragraph [0001], “The embodiments of the present invention provide a video processing method, device, terminal, and storage medium, which can adaptively adjust a super-resolution strategy to perform super-resolution reconstruction on a video stream, thereby effectively improving video quality”), which comprises a memory and a processor (see paragraph [0060], “In another aspect, an embodiment of the present invention provides an intelligent terminal, which includes a processor, a communication interface, and a memory”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the video super-resolution device taught by Hu with the video super-resolution method taught by Nah. The motivation for doing so would be to integrate the device into another system.
Hu teaches in paragraph [0077], “The video processing system may be specifically integrated in an electronic device, and the electronic device may be a terminal or a server. For example, the video processing system can be integrated in the terminal. The terminal may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal computer (PC, Personal Computer), a TV, or other smart playback device, which is not limited in this application.” Thus, it would have been obvious to combine the video super-resolution device taught by Hu with the method taught by Nah in order to obtain the invention as claimed in Claim 10.

As to Claim 11, Claim 11 is directed towards a computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a computing device, causes the computing device to implement the same method as claimed in Claim 1. Nah teaches the video super-resolution method of Claim 1, but fails to explicitly teach a computer-readable storage medium. However, Hu teaches a computer-readable storage medium (see paragraph [0001], “The embodiments of the present invention provide a video processing method, device, terminal, and storage medium, which can adaptively adjust a super-resolution strategy to perform super-resolution reconstruction on a video stream, thereby effectively improving video quality”), which can contain a computer program (see paragraph [0060], “The processor, the communication interface, and the memory are connected to each other, wherein the memory is used to store a computer program, The computer program includes program instructions, and the processor is configured to call the program instructions for performing operations involved in the foregoing video processing method”).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the video processing device taught by Hu with the video processing method taught by Nah. The motivation for doing so would be to integrate the device into other electronic devices, as taught by Hu in paragraph [0077]. Thus, it would have been obvious to combine the video super-resolution device taught by Hu with the super-resolution method taught by Nah in order to obtain the invention as claimed in Claim 11.

As to Claim 13, Claim 13 claims the same limitation as Claim 2 and is dependent on a similarly rejected independent claim. Therefore, the rejection and rationale are analogous to those made for Claim 2.

As to Claim 14, Claim 14 claims the same limitation as Claim 3 and is dependent on a similarly rejected independent claim. Therefore, the rejection and rationale are analogous to those made for Claim 3.

As to Claim 17, Claim 17 claims the same limitation as Claim 6 and is dependent on a similarly rejected independent claim. Therefore, the rejection and rationale are analogous to those made for Claim 6.

As to Claim 18, Claim 18 claims the same limitation as Claim 7 and is dependent on a similarly rejected independent claim. Therefore, the rejection and rationale are analogous to those made for Claim 7.

As to Claim 19, Claim 19 claims the same limitation as Claim 8 and is dependent on a similarly rejected independent claim. Therefore, the rejection and rationale are analogous to those made for Claim 8.

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Nah et al. (S. Nah et al., "NTIRE 2019 Challenge on Video Super-Resolution: Methods and Results," 2019), hereinafter Nah, in view of Gupta et al. (A. Gupta, et al., "Enhancing and experiencing spacetime resolution with videos and stills," 2009 IEEE International Conference on Computational Photography (ICCP)), hereinafter Gupta, and further in view of Hu et al. (CN 112565887), hereinafter Hu.

As to Claim 15, Nah and Hu fail to explicitly teach upsampling the target video frame and each of the neighborhood video frames of the target video frame, to obtain an upsampled video frame of the target video frame and an upsampled video frame of each of the neighborhood video frames; acquiring an optical flow between the upsampled video frame of each of the neighborhood video frames and the upsampled video frame of the target video frame; and aligning each of the neighborhood features of the fusion feature with the target feature of the fusion feature on the basis of the optical flow between the upsampled video frame of each of the neighborhood video frames and the upsampled video frame of the target video frame, to obtain an alignment feature corresponding to the RDB that outputs the fusion feature.

However, in an analogous art, Gupta teaches a method for enhancing the spacetime resolution of videos (see abstract on page 1), which includes upsampling adjacent video frames (see page 3, section 3.1, “The input consists of a stream of low-resolution frames with intermittent high-resolution stills. We upsample the low-resolution frames using bicubic interpolation to match the size of the high-resolution stills and denote them by fi. For each fi, the nearest two high-resolution stills are denoted as Sleft and Sright”), then calculating the flow between the upsampled frames (see page 3, section 3.1, “The system estimates motion between every fi and corresponding Sleft & Sright… One approach is to compute optical flow directly from the high-resolution stills, Sleft or Sright, to the upsampled frames fi”), and then aligning the frames on the basis of optical flow between the upsampled video frames (see page 3, section 3.1, “Once the system has computed correspondences from Sleft to fi and Sright to fi, it warps the high-resolution stills to bring them into alignment with fi”).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the upsampling taught by Gupta with the super-resolution method taught by Nah and Hu. Gupta teaches on page 3, section 3.1, “The summed motion estimation serves as initialization to bring long range motion within the operating range of the optical flow algorithm and reduces the errors accumulated from the pairwise sums.” Thus, it would have been obvious to combine the upsampling taught by Gupta with the teachings of Nah and Hu in order to obtain the invention as claimed in Claim 15.

Allowable Subject Matter

Claims 5 and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Nah, Gupta, and Hu fail to teach: upsampling each of the neighborhood features and the target feature respectively, to obtain an upsampled feature of each of the neighborhood video frames and an upsampled feature of the target video frame; aligning the upsampled feature of each of the neighborhood video frames with the upsampled feature of the target video frame on the basis of the optical flow between the upsampled video frame of each of the neighborhood video frames and the upsampled video frame of the target video frame, to obtain an upsampled alignment feature of each of the neighborhood video frames; performing a space-to-depth conversion on the upsampled feature of the target video frame and the upsampled alignment feature of each of the neighborhood video frames respectively, to obtain an equivalent feature of the target video frame and an equivalent feature of each of the neighborhood video frames; and merging the equivalent feature of the target video frame and the equivalent feature of each of the neighborhood video frames, to obtain an alignment feature corresponding to the RDB that outputs the fusion feature.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Porikli (US Pub. No. 2022/0222776) teaches a video super-resolution method comprising acquiring a first feature, processing the first feature with a network of residual dense units, and then using the output of the residual units to output a high-resolution frame. The frame is then aligned with previously processed frames in order, and then put into another network in order to generate a frame with higher resolution. Porikli fails to teach a ‘fusion feature’ comprising a target feature and multiple neighboring features.

Hou (CN 113628115) teaches space-to-depth conversion of features for the purposes of super-resolution. However, Hou fails to explicitly teach upsampling each feature of the fusion feature to obtain an equivalent feature of the target video frame and an equivalent feature of each of the neighborhood video frames, and merging the equivalent feature of the target video frame and the equivalent feature of each of the neighborhood video frames.

Wang et al. (CN 111583112), cited in the Chinese Search Report, teaches a method for video super-resolution which includes aligning video frames through deformable convolution. However, the alignment occurs before the frames are input into the residual dense network, and thus each feature produced by the RDB is not aligned. The same author published a paper (H. Wang, D. Su, C. Liu, L. Jin, X. Sun and X. Peng, "Deformable Non-Local Network for Video Super-Resolution," in IEEE Access, vol. 7, pp. 177734-177744, 2019) that teaches a similar architecture, which also aligns video frames before inputting the frames into a residual network.

Dai et al. (CN 112767251), cited in the Chinese Search Report, is directed towards a method of image super-resolution. Dai teaches extracting and fusing features, but fails to teach aligning features.

Du et al. (X. Du, Y. Zhou, Y. Chen, Y. Zhang, J. Yang and D. Jin, "Dense-Connected Residual Network for Video Super-Resolution," 2019 IEEE International Conference on Multimedia and Expo (ICME), Shanghai, China, 2019) teaches a residual network for video super-resolution that uses optical flow to align video frames. However, the video frames are aligned before they are input into the residual network.

Su et al. (D. Su, H. Wang, L. Jin, X. Sun and X. Peng, "Local-Global Fusion Network for Video Super-Resolution," in IEEE Access, vol. 8, pp. 172443-172456, 2020) teaches a video super-resolution method that uses residual blocks to extract features and then aligns the features. However, Su fails to teach that the features are aligned to a target feature of a fusion feature.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SOUMYA THOMAS, whose telephone number is (571) 272-8639. The examiner can normally be reached M-F 8:30-5:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Jennifer Mehmood, can be reached at (571) 272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /S.T./Examiner, Art Unit 2664 /JENNIFER MEHMOOD/Supervisory Patent Examiner, Art Unit 2664
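The anticipation mapping above turns on one operation repeated at every RDB stage: split the fused per-frame feature, warp each neighborhood feature onto the target feature using optical flow, and re-merge the result into an alignment feature. The following NumPy sketch illustrates that per-stage alignment step in the abstract. It is an illustration of the general technique only: the function and variable names are hypothetical, it uses nearest-neighbor rather than the bilinear sampling real systems use, and it is not code from Nah, Gupta, Hu, or the instant application.

```python
import numpy as np

def warp_to_target(feature, flow):
    """Backward-warp one neighborhood feature map (C, H, W) toward the target
    frame using a per-pixel flow field (2, H, W); nearest-neighbor sampling
    is used here for brevity."""
    c, h, w = feature.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Sample each output pixel from (y + flow_y, x + flow_x), clamped to bounds.
    src_y = np.clip(np.round(ys + flow[1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[0]).astype(int), 0, w - 1)
    return feature[:, src_y, src_x]

def align_stage(fused, flows, target_idx):
    """Split a fused per-frame feature (T, C, H, W), warp every neighborhood
    feature onto the target, and re-concatenate into an alignment feature."""
    aligned = [f if t == target_idx else warp_to_target(f, flows[t])
               for t, f in enumerate(fused)]
    return np.concatenate(aligned, axis=0)  # shape (T*C, H, W)

# Tiny usage example: 5 frames, 4 channels, 8x8 features, zero flow everywhere.
fused = np.random.rand(5, 4, 8, 8)
flows = np.zeros((5, 2, 8, 8))
out = align_stage(fused, flows, target_idx=2)
print(out.shape)  # (20, 8, 8)
```

With zero flow the warp is the identity, so the output simply stacks the five per-frame features; nonzero flow displaces each neighborhood feature toward the target frame before the stack.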

Prosecution Timeline

Mar 28, 2024 — Application Filed
Mar 06, 2026 — Non-Final Rejection: §101, §102, §103 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 100%
With Interview: 99% (+0.0%)
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 2 resolved cases by this examiner. Grant probability derived from career allow rate.
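The footnote states the derivation directly: the headline probability is the examiner's career allow rate. A minimal sketch of that computation follows; the variable names and the 62% Tech Center baseline are assumptions read off the "+38.0% vs TC avg" figure, and with only 2 resolved cases the estimate carries a very wide margin of error.

```python
# Hypothetical reconstruction of the headline projection. The dashboard's exact
# formulas are unpublished; this mirrors only what the footnote says: grant
# probability is derived from the examiner's career allow rate.
granted, resolved = 2, 2
career_allow_rate = granted / resolved      # 1.00 -> "100% Grant Probability"

tc_allow_rate = 0.62                        # assumed: implied by "+38.0% vs TC avg"
vs_tc = career_allow_rate - tc_allow_rate   # +0.38

print(f"Grant probability: {career_allow_rate:.0%} ({vs_tc:+.1%} vs TC avg)")
```

Note the small denominator: a 100% rate over two resolved cases is far weaker evidence than the same rate over two hundred, which is presumably why the footnote discloses the sample size.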
