DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 34 – 53 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
On pages 7-9, applicant argues that Shi does not teach selecting a picture resolution by comparing the first similarity metric to at least one threshold, encoding with a resolution of the first source picture when the threshold is not satisfied, and encoding with a resolution of the first reduced resolution picture when the threshold is satisfied, as claimed. According to applicant, Shi instead teaches relating distortion values to a constant rate factor and identifying a constant rate factor transition threshold at corresponding positions on the distortion-threshold curve, which is not the same as the claimed comparison of a similarity metric to a threshold followed by selection of a picture resolution based at least in part on the first similarity metric. While applicant’s arguments are understood, examiner respectfully disagrees. Examiner relies on a combination of Kotra and Shi in maintaining the rejection.
One cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981). Rather, “the test for obviousness is what the combined teachings of the references would have suggested to [a POSITA].” In re Mouttet, 686 F.3d 1322, 1333, 103 USPQ2d 1219, 1226 (Fed. Cir. 2012). At present, the combined teachings of Kotra and Shi reasonably suggest to a person of ordinary skill in the art selecting a picture resolution by comparing the first similarity metric to at least one threshold, encoding with a resolution of the first source picture when the threshold is not satisfied, and encoding with a resolution of the first reduced resolution picture when the threshold is satisfied, as claimed.
First, Kotra teaches determining a first similarity metric between a first reduced resolution picture and the first source picture. See, e.g. section 1: describing that the system determines a peak signal noise ratio (PSNR) between the down sampled picture and the original picture. Next, Kotra teaches determining whether to encode the first source picture at a reduced resolution or a normal resolution based on the similarity metric. See, e.g. section 4: describing that the encoder determines whether to encode a picture at a downsampled resolution or a normal resolution based on a rate distortion (RD), wherein the RD is the equivalent of the similarity metric, and wherein the normal resolution is the equivalent of the resolution of the source picture. Kotra further teaches that the determination is made based on a comparison of the similarity metric. See, e.g. section 4: describing that the system determines whether to encode a picture at the downsampled resolution or the normal resolution based on a comparison of the RD. Kotra does not explicitly teach wherein the system selects a picture resolution by comparing the first similarity metric to at least one threshold, the system encoding the picture with the reduced resolution when the threshold is satisfied and encoding the picture with the source resolution when the threshold is not satisfied. Shi, however, teaches a system and method for determining whether to encode a picture at a reduced resolution or a normal resolution. See, e.g. Fig. 2B and col 11, line 6 – col 12, line 28: depicting and describing that the system determines whether to encode a picture using a normal resolution or a decreased resolution, and encodes the picture at the selected resolution.
Shi further teaches that this determination is made by comparing a quality metric with a threshold, the system encoding the picture at a reduced resolution when the threshold is satisfied and encoding the picture at a normal resolution when the threshold is not satisfied. See, e.g. Fig. 2B and col 11, line 6 – col 12, line 28: depicting and describing that the system determines whether to encode a picture at a reduced resolution by comparing a constant rate factor with a constant rate factor threshold, the system encoding the picture at a normal resolution when the constant rate factor is below the threshold and encoding the picture at a reduced resolution when the constant rate factor is equal to or greater than the threshold, wherein the constant rate factor is the equivalent of the quality metric [see, e.g. col 6, lines 4 – 29: describing that the constant rate factor is a quality setting of the encoder, the quality setting based on a combination of encoding parameters and resolution]. Shi next teaches that both the quality metric and the threshold have an associated relationship with downsampling distortion, the threshold being set based on that relationship. See, e.g. col 14, line 55 - col 15, line 3: describing that the system utilizes the relationship between downsampling distortion and constant rate factor to determine a constant rate factor threshold, the constant rate factor threshold corresponding to a plurality of downsampling distortion values, wherein the constant rate factor is the equivalent of the quality metric. In other words, Shi teaches using downsampling distortion to determine whether a picture should be coded at a normal resolution or a downsampled resolution by comparing a quality metric to a threshold, both the quality metric and the threshold associated with a downsampling distortion value. 
The combined teachings of Kotra and Shi therefore reasonably suggest to a person of ordinary skill in the art selecting a picture resolution by comparing the first similarity metric to at least one threshold, encoding with a resolution of the first source picture when the threshold is not satisfied, and encoding with a resolution of the first reduced resolution picture when the threshold is satisfied, as claimed. One of ordinary skill in the art would have been motivated to make such a combination because the combination improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
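For illustration only, the threshold-based selection discussed above can be pictured with the following minimal sketch. It is not taken from Kotra, Shi, or the application. The names `Resolution` and `select_resolution` are hypothetical, and the sketch assumes that the similarity metric is a PSNR-like value in dB for which a threshold is "satisfied" when the metric meets or exceeds it.

```python
from enum import Enum


class Resolution(Enum):
    """Hypothetical labels for the two candidate encoding resolutions."""
    SOURCE = "source"
    REDUCED = "reduced"


def select_resolution(similarity_db: float, threshold_db: float) -> Resolution:
    """Pick an encoding resolution by comparing a similarity metric to a threshold.

    Assumption: a higher metric means the reduced-resolution picture is closer
    to the source, so the threshold is treated as satisfied when
    similarity_db >= threshold_db.
    """
    if similarity_db >= threshold_db:
        # Threshold satisfied: encode at the reduced resolution.
        return Resolution.REDUCED
    # Threshold not satisfied: encode at the source resolution.
    return Resolution.SOURCE
```

Under these assumptions, a picture whose metric meets the threshold is routed to the reduced resolution and every other picture keeps the source resolution.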
Examiner Remarks
Claims are interpreted in the alternative only.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 34, 35, 38 – 39, 42, 43, 45, and 47 – 53 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kotra et al., “AHG11: Neural Network-based Super Resolution”, 21st Meeting of Joint Video Experts Team (JVET) of ITU-T SG WP 3 and ISO-IEC JTC 1/SC 29, No. JVET-U0099-v2, 6-15 January 2021 (hereinafter Kotra), as cited by applicant, in view of Shi et al. (US 11,553,188) (hereinafter Shi).
Regarding claims 34 and 53, Kotra teaches a method for determining resolution, and a computer program product comprising a non-transitory computer readable medium storing instructions which when performed by processing circuitry of a device causes the device to perform the method, the method comprising:
obtaining a first source picture (e.g., section 1: describing that the system obtains a picture);
generating a first reduced resolution picture based on the first source picture (e.g., section 1: describing that the picture is down-sampled, wherein down sampling a picture is the equivalent of generating the first reduced resolution picture of the picture);
determining a first similarity metric for the first reduced resolution picture and the first source picture (e.g., section 1: describing that the system determines the peak signal noise ratio (PSNR) between the down sampled picture and the original picture),
wherein determining the first similarity metric comprises:
(i) upscaling the first reduced resolution picture to the resolution of the first source picture to generate an up-scaled picture (e.g. section 1: describing that the system upsamples the down sampled picture), and
(ii) comparing the up-scaled picture to the first source picture (e.g. section 1: describing that the system compares the upsampled picture with the source picture);
selecting a picture resolution based at least in part on the first similarity metric (e.g. section 4: describing that the system determines whether to encode a picture at a downsampled resolution or a normal resolution based on rate distortion cost, wherein rate distortion cost is the equivalent of the first similarity metric);
performing an encoding operation with the selected picture resolution (e.g. section 4: describing that the system codes a given picture at either the downsampled resolution or the normal resolution according to the determination),
wherein:
selecting a picture resolution comprises comparing the first similarity metric (e.g. section 4: describing that the system determines whether to encode a picture at a downsampled resolution or at a normal resolution based on a comparison of the rate distortion, wherein the rate distortion is the equivalent of the first similarity metric);
generating the first reduced resolution picture comprises downscaling the first source picture by applying a rescaling filter (e.g. introduction: describing that the system downsamples and upsamples the picture using an interpolation filter, wherein the interpolation filter is the equivalent of the rescaling filter);
applying a re-scaling filter comprises applying an interpolation filter to one or more luma or chroma components of the first source picture (e.g. introduction: describing that the system downsamples and upsamples the picture using an interpolation filter); and
the first source picture is part of a set of pictures, and wherein the selected resolution is used for the entire set of pictures during the encoding operation (e.g. introduction: describing that the picture is a part of a sequence of pictures, the entire sequence being encoded using the same resolution).
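As an illustrative aid only, the downscale, upscale, and compare sequence mapped to Kotra above can be sketched as follows. This is not Kotra's implementation. The 2x2 averaging and sample replication are assumed stand-ins for the interpolation filter, the function names are hypothetical, and the picture is assumed to be a single luma plane with even dimensions stored as a list of rows.

```python
import math


def downscale_by_two(picture):
    """Halve each dimension by averaging 2x2 blocks (assumed stand-in filter)."""
    height, width = len(picture), len(picture[0])
    return [
        [
            (picture[y][x] + picture[y][x + 1]
             + picture[y + 1][x] + picture[y + 1][x + 1]) / 4.0
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def upscale_by_two(picture):
    """Double each dimension by replicating samples (assumed stand-in filter)."""
    upscaled = []
    for row in picture:
        wide_row = [sample for sample in row for _ in range(2)]
        upscaled.append(wide_row)
        upscaled.append(list(wide_row))
    return upscaled


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pictures."""
    squared_error = 0.0
    count = 0
    for ref_row, cand_row in zip(reference, candidate):
        for ref, cand in zip(ref_row, cand_row):
            squared_error += (ref - cand) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical pictures
    return 10.0 * math.log10(max_value ** 2 / mse)


def similarity_metric(source):
    """Downscale, upscale back to source size, and compare against the source."""
    return psnr(source, upscale_by_two(downscale_by_two(source)))
```

A flat picture survives the round trip unchanged and yields an infinite PSNR, while a picture with fine detail loses information in the downscaling step and yields a lower value.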
Kotra does not explicitly teach:
wherein:
selecting a picture resolution by comparing the first similarity metric comprises selecting a picture resolution by comparing the first similarity metric to at least one threshold;
the encoding operation comprises encoding with a resolution of the first source picture when the threshold is not satisfied, and encoding with a resolution of the first reduced resolution picture when the threshold is satisfied.
Shi, however, teaches a computer program product and a method for determining resolution:
wherein:
selecting a picture resolution by comparing the first similarity metric comprises selecting a picture resolution by comparing the first similarity metric to at least one threshold (e.g. Figs. 2A and 2B, and col 10, line 42 – col 11, line 51: depicting and describing that when determining whether a picture should be encoded using the original resolution or a reduced resolution, the system compares the distortion with a constant rate factor threshold); and
the encoding operation comprises encoding with a resolution of the first source picture when the threshold is not satisfied, and encoding with a resolution of the first reduced resolution picture when the threshold is satisfied (e.g. Fig. 2B and col 3, lines 38 – 55 and col 10, line 42 – col 11, line 51: depicting and describing that when the threshold is not satisfied, the system encodes the video at the original resolution and when the threshold is satisfied, the system encodes the video at the reduced resolution).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi in order to select a picture resolution based at least in part on the first similarity metric, perform an encoding operation with the selected picture resolution, wherein selecting a picture resolution comprises comparing the first similarity metric to at least one threshold, and wherein the encoding operation comprises encoding with a resolution of the first source picture when the threshold is not satisfied, and encoding with a resolution of the first reduced resolution picture when the threshold is satisfied. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Turning to claim 35, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra further teaches:
wherein performing an encoding operation comprises applying reference picture resampling (RPR) to a Versatile Video Coding (VVC) video segment (e.g. Abstract and introduction : describing that the system applies Versatile Video Coding (VVC) reference picture resampling (RPR)).
Regarding claim 38, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein the threshold is set based at least in part on: (i) use of noise reduction in the first source picture; (ii) encoding bit depth; or (iii) quality level of a picture.
Shi, however, teaches a method for determining resolution:
wherein the threshold is set based at least in part on: (i) use of noise reduction in the first source picture; (ii) encoding bit depth; or (iii) quality level of a picture (e.g. col 3, lines 21 – 55 and col 6, lines 4 – 55: describing that the distortion between the original picture and the down sampled then upsampled picture is compared to a constant rate factor threshold, the constant rate factor threshold being a quality level of a picture).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi so that the threshold is set based at least in part on: (i) use of noise reduction in the first source picture; (ii) encoding bit depth; or (iii) quality level of a picture. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Turning to claim 39, Kotra and Shi teach all of the limitations of claims 34 and 38, as discussed above. Kotra does not explicitly teach:
wherein the quality level of a picture is a quantization parameter (QP).
Shi, however, teaches a method for determining resolution:
wherein the quality level of a picture is a quantization parameter (QP) (e.g. col 6, lines 4 – 29: describing that the constant rate factor is a quantization parameter).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi so that the quality level of a picture is a quantization parameter (QP). One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Regarding claim 42, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein the threshold is variable and is determined based on a mapping between a quality level of a picture and the similarity metric.
Shi, however, teaches a method for determining resolution:
wherein the threshold is variable and is determined based on a mapping between a quality level of a picture and the similarity metric (e.g. Fig. 5B and col 14, line 55 – col 15, line 3: depicting and describing that the threshold is variable and determined based on a relationship between picture quality [constant rate factor] and the downsampling distortion, wherein the downsampling distortion is the equivalent of the similarity metric).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi so that the threshold is variable and is determined based on a mapping between a quality level of a picture and the similarity metric. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
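For illustration only, a variable threshold determined from a mapping between a quality level and the similarity metric can be pictured as a lookup with interpolation. The table values, the use of QP as the quality level, and the names `QP_TO_THRESHOLD_DB` and `threshold_for_qp` are hypothetical assumptions for this sketch and are not taken from Shi or the application.

```python
from bisect import bisect_left

# Hypothetical mapping points: (quality level as QP, PSNR threshold in dB).
# The numbers are illustrative only and must stay sorted by QP.
QP_TO_THRESHOLD_DB = [(22, 44.0), (27, 41.0), (32, 38.0), (37, 35.0)]


def threshold_for_qp(qp, table=QP_TO_THRESHOLD_DB):
    """Return a threshold for the given QP by linear interpolation.

    QPs outside the table are clamped to the nearest end point.
    """
    qps = [point[0] for point in table]
    if qp <= qps[0]:
        return table[0][1]
    if qp >= qps[-1]:
        return table[-1][1]
    upper = bisect_left(qps, qp)  # index of first table QP >= qp
    (qp_low, thr_low), (qp_high, thr_high) = table[upper - 1], table[upper]
    fraction = (qp - qp_low) / (qp_high - qp_low)
    return thr_low + (thr_high - thr_low) * fraction
```

With this kind of mapping the threshold is no longer a single constant: each quality level yields its own threshold against which the similarity metric is compared.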
Regarding claim 43, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra further teaches:
determining a downsampling distortion, the downsampling distortion being a peak signal to noise ratio (PSNR) between the up-scaled picture and the first source picture (e.g. section 1: describing that the system determines a PSNR between the original picture and the up-sampled picture).
Kotra does not explicitly teach:
wherein one or more of the selected resolution or the threshold is based at least in part on a quantization parameter (QP) and a peak signal to noise ratio (PSNR) between the up-scaled picture and the first source picture.
Shi, however, teaches a method for determining resolution:
wherein one or more of the selected resolution or the threshold is based at least in part on a quantization parameter (QP) and a peak signal to noise ratio (PSNR) between the up-scaled picture and the first source picture (e.g. Figs. 2A, 2B and 5B, and col 10, line 1 – col 11, line 51: depicting and describing that the system selects the resolution and the threshold based on a constant rate factor and a downsampling distortion between the up-scaled picture and the original picture, wherein the constant rate factor is the equivalent of the quantization parameter [see, e.g. col 6, lines 4 – 29: describing that the constant rate factor is based at least in part on a quantization parameter], and wherein the downsampling distortion is the equivalent of the PSNR).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi so that one or more of the selected resolution or the threshold is based at least in part on a quantization parameter (QP) and a peak signal to noise ratio (PSNR) between the up-scaled picture and the first source picture. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Turning to claim 45, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
transmitting the encoded picture or transmitting a resolution indication from an encoder to a decoder.
Shi, however, teaches a method for determining a resolution:
transmitting the encoded picture or transmitting a resolution indication from an encoder to a decoder (e.g. col 7, lines 35 – 49: describing that the system transmits the encoded digital video).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi in order to transmit the encoded picture. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Regarding claim 47, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra further teaches:
performing an additional encoding operation, wherein the first source picture is encoded at the source resolution to generate a first encoded picture (e.g. section 4: describing that the system encodes the original picture at the source resolution),
wherein:
the first reduced resolution picture is generated by encoding the first source picture at a reduced resolution (e.g. Abstract and sections 1 and 4: describing that the system encodes the down sampled picture of the original picture, wherein the down sampled picture is the equivalent of the first source picture at a reduced resolution);
the first similarity metric is determined based on a comparison of the first encoded picture and the encoded first reduced resolution picture (e.g. section 4: describing that the system compares the coded picture at the original resolution with the coded picture at a down sampled resolution); and
the first similarity metric is based on one or more of bit rate or distortion (e.g. section 4: describing that the system determines a rate distortion (RD)).
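As an illustrative aid only, a rate distortion comparison between a picture coded at the source resolution and the same picture coded at a reduced resolution can be sketched with the commonly used Lagrangian cost J = D + lambda * R. Whether Kotra uses this exact formulation is an assumption, and the names `rd_cost` and `prefer_reduced` are hypothetical.

```python
def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian rate distortion cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


def prefer_reduced(full_res, reduced_res, lagrange_multiplier):
    """Return True when the reduced-resolution encode has the lower RD cost.

    Each argument is a (distortion, rate_bits) pair. The distortion of the
    reduced-resolution encode is assumed to be measured after upscaling back
    to the source resolution so that both values are comparable.
    """
    full_cost = rd_cost(full_res[0], full_res[1], lagrange_multiplier)
    reduced_cost = rd_cost(reduced_res[0], reduced_res[1], lagrange_multiplier)
    return reduced_cost < full_cost
```

A larger multiplier weights bit rate more heavily, which tends to favor the reduced-resolution encode; a smaller multiplier weights distortion more heavily and tends to favor the source resolution.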
Turning to claim 48, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein selecting the picture resolution comprises retrieving a compression value based at least in part on the first similarity metric.
Shi, however, teaches a method for determining a resolution:
wherein selecting the picture resolution comprises retrieving a compression value based at least in part on the first similarity metric (e.g. Fig. 5B, and col 14, line 55 – col 15, line 3: describing that the system retrieves a constant rate factor threshold based on the downsampling distortion, the constant rate factor threshold including a quantization value [see, col 6, lines 4 – 55: describing that the constant rate factor threshold includes a quantization parameter], wherein the downsampling distortion is the equivalent of the first similarity metric).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi so that selecting the picture resolution comprises retrieving a compression value based at least in part on the first similarity metric. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Regarding claim 49, Kotra and Shi teach all of the limitations of claims 34 and 48, as discussed above. Kotra does not explicitly teach:
wherein the compression value is one or more of a quantization parameter (QP) value; a QP threshold; or a QP delta.
Shi, however, teaches a method for determining a resolution:
wherein the compression value is one or more of a quantization parameter (QP) value; a QP threshold; or a QP delta (e.g. Fig. 5B, and col 14, line 55 – col 15, line 3: describing that the system retrieves a constant rate factor threshold based on the downsampling distortion, the constant rate factor threshold including a quantization value [see, col 6, lines 4 – 55: describing that the constant rate factor threshold includes a quantization parameter]).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi so that the compression value is one or more of a quantization parameter (QP) value; a QP threshold; or a QP delta. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Turning to claim 50, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra further teaches:
wherein the first reduced resolution picture has a resolution of two-thirds or half of the resolution of the source picture (e.g. Abstract: describing that the original picture is down sampled by a factor of 2, wherein downsampling by a factor of 2 is the equivalent of a resolution that is half the resolution of the source picture).
Regarding claim 51, Kotra teaches an apparatus, wherein the apparatus comprises processing circuitry and a memory containing instructions executable by the processing circuitry, whereby the apparatus is configured to:
obtain a first source picture (e.g., section 1: describing that the system obtains a picture);
generate a first reduced resolution picture based on the first source picture (e.g., section 1: describing that the picture is down-sampled, wherein down sampling a picture is the equivalent of generating the first reduced resolution picture of the picture); and
determine a first similarity metric for the first reduced resolution picture and the first source picture (e.g., section 1: describing that the system determines the peak signal noise ratio (PSNR) between the down sampled picture and the original picture),
wherein determining the first similarity metric comprises:
(i) upscaling the first reduced resolution picture to the resolution of the first source picture to generate an up-scaled picture (e.g. section 1: describing that the system upsamples the down sampled picture), and
(ii) comparing the up-scaled picture to the first source picture (e.g. section 1: describing that the system compares the upsampled picture with the source picture);
select a picture resolution based at least in part on the first similarity metric (e.g. section 4: describing that the system determines whether to encode a picture at a downsampled resolution or a normal resolution based on rate distortion cost, wherein rate distortion cost is the equivalent of the first similarity metric), and
perform an encoding operation with the selected picture resolution (e.g. section 4: describing that the system codes a given picture at either the downsampled resolution or the normal resolution according to the determination),
wherein selecting a picture resolution comprises comparing the first similarity metric (e.g. section 4: describing that the system determines to code a picture at either the downsampled resolution or the normal resolution based on a comparison of the rate distortion, wherein the rate distortion is the equivalent of the first similarity metric).
Kotra does not explicitly teach:
wherein selecting a picture resolution by comparing the first similarity metric comprises selecting a picture resolution by comparing the first similarity metric to at least one threshold.
Shi, however, teaches an apparatus for determining a resolution:
wherein selecting a picture resolution by comparing the first similarity metric comprises selecting a picture resolution by comparing the first similarity metric to at least one threshold (e.g. Figs. 2A and 2B, and col 10, line 42 – col 11, line 51: depicting and describing that when determining whether a picture should be encoded using the original resolution or a reduced resolution, the system compares the distortion with a constant rate factor threshold).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Shi in order to select a picture resolution based at least in part on the first similarity metric, perform an encoding operation with the selected picture resolution, wherein selecting a picture resolution comprises comparing the first similarity metric to at least one threshold. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves accuracy of computing systems that provide digital video and provides more accurate quality across different encoding bitrates (Shi, e.g. col 4, lines 50 – 65: describing a desire to improve the accuracy of computing systems that provide digital video and a desire to provide more accurate quality across different encoding bitrates).
Turning to claim 52, Kotra and Shi teach all of the limitations of claim 51, as discussed above. Kotra further teaches:
wherein the apparatus is an encoder, decoder, or network node (e.g. section 4: describing that the system is an encoder).
Claim(s) 36, 40, 41, 54, and 55 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kotra et al., “AHG11: Neural Network-based Super Resolution”, 21st Meeting of Joint Video Experts Team (JVET) of ITU-T SG WP 3 and ISO-IEC JTC 1/SC 29, No. JVET-U0099-v2, 6-15 January 2021 (hereinafter Kotra), as cited by applicant, in view of Shi et al. (US 11,553,188) (hereinafter Shi) as applied to claim 34 above, and further in view of Jia (US 8,681,866) (hereinafter Jia).
Regarding claim 36, Kotra and Shi teach all the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein the set of pictures is a group of pictures (GOP) and all pictures in the GOP are encoded at a reduced resolution.
Jia, however, teaches a method for determining a resolution:
wherein the set of pictures is a group of pictures (GOP) and all pictures in the GOP are encoded at a reduced resolution (e.g. col 3, lines 28 – 47: describing that a picture is included in a group of pictures (GOP) and the system encodes all pictures in the GOP at a reduced resolution).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Jia so that the set of pictures is a group of pictures (GOP) and all pictures in the GOP are encoded at a reduced resolution. One of ordinary skill in the art would have been motivated to make such a modification because the modification reduces the bitrate of the encoded GOP and provides higher quality encoding (Jia, e.g. col 3, lines 40 – 47: describing a desire to reduce the bitrate of encoded GOP and provide higher quality encoding at a target bitrate).
Turning to claim 40, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein the first source picture is a picture to be intra coded or a picture that will be encoded with a temporal layer id equal to 0.
Jia, however, teaches a method for determining a resolution:
wherein the first source picture is a picture to be intra coded or a picture that will be encoded with a temporal layer id equal to 0 (e.g. Fig. 4, and col 3, lines 28 – 47, col 4, lines 47 – 56, and col 6, lines 48 – 58: depicting and describing that the system determines an encoding resolution based on evaluation of an I-frame, the I-frame is the equivalent of the first source picture, and wherein it is known to those of ordinary skill in the art that an I-frame is necessarily a picture to be intra coded).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Jia in order for the first source picture to be a picture to be intra coded or a picture that will be encoded with a temporal layer id equal to 0. One of ordinary skill in the art would have been motivated to make such a modification because the modification reduces the bitrate of the encoded GOP and provides higher quality encoding (Jia, e.g. col 3, lines 40 – 47: describing a desire to reduce the bitrate of encoded GOP and provide higher quality encoding at a target bitrate).
Regarding claim 41, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein the first source picture corresponds to every Nth picture to be coded in a set and the resolution for encoding N pictures is based on the first source picture in each set of N pictures.
Jia, however, teaches a method for determining a resolution:
wherein the first source picture corresponds to every Nth picture to be coded in a set and the resolution for encoding N pictures is based on the first source picture in each set of N pictures (e.g. col 4, lines 36 – 56: describing that the system determines the encoding resolution for a group of pictures by evaluating the I-frame in the GOP, wherein it is known to those of ordinary skill in the art that every GOP necessarily starts with an I-frame, and wherein determining the encoding resolution for the GOP based on an evaluation of the I-frame, the I-frame being the first frame in the GOP, is the equivalent of the first source picture corresponding to every Nth picture to be coded in a set and the resolution for encoding N pictures being based on the first source picture in each set of N pictures).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Jia in order for the first source picture to correspond to every Nth picture to be coded in a set and for the resolution for encoding N pictures to be based on the first source picture in each set of N pictures. One of ordinary skill in the art would have been motivated to make such a modification because the modification reduces the bitrate of the encoded GOP and provides higher quality encoding (Jia, e.g. col 3, lines 40 – 47: describing a desire to reduce the bitrate of encoded GOP and provide higher quality encoding at a target bitrate).
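For illustration only, the claim 41 limitation (one resolution decision per set of N pictures, made from the first picture of each set) can be sketched as follows. This is a minimal sketch and is not drawn from Kotra, Shi, or Jia; the function name `resolutions_per_set` and the `decide` callback are hypothetical, and each picture is represented by an arbitrary value that the callback understands.

```python
def resolutions_per_set(pictures, n, decide):
    """Assign one resolution choice to every picture, decided once per set of n.

    `decide` is a hypothetical callback that inspects only the first picture
    of a set and returns a label such as "reduced" or "source".
    """
    if n <= 0:
        raise ValueError("n must be positive")
    plan = []
    for start in range(0, len(pictures), n):
        group = pictures[start:start + n]
        choice = decide(group[0])           # evaluate only the first picture
        plan.extend([choice] * len(group))  # apply the choice to the whole set
    return plan
```

Under this sketch, a set whose first picture is an I-frame at the start of a GOP yields a single decision that governs every picture in that GOP.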
Turning to claim 54, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra further teaches:
wherein comparing the first similarity metric comprises comparing a peak signal to noise ratio (PSNR) (e.g. sections 1 and 4: describing that the system compares the rate distortion of a downsampled picture, the rate distortion being a peak signal to noise ratio [see, e.g. section 1: describing that the system determines a rate distortion by calculating a peak signal to noise ratio (PSNR)], wherein the rate distortion is the equivalent of the first similarity metric).
Kotra, however, does not explicitly teach:
wherein comparing to the at least one threshold comprises comparing to a threshold PSNR value.
Jia, however, teaches a method for determining a resolution:
wherein comparing to the at least one threshold comprises comparing to a threshold PSNR value (e.g. col 1, lines 53 – 63: describing that the system compares a metric to a threshold, the threshold based on a PSNR value).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Jia in order for comparing to the at least one threshold to comprise comparing to a threshold PSNR value. One of ordinary skill in the art would have been motivated to make such a modification because the modification reduces the bitrate of the encoded GOP and provides higher quality encoding (Jia, e.g. col 3, lines 40 – 47: describing a desire to reduce the bitrate of encoded GOP and provide higher quality encoding at a target bitrate).
Regarding claim 55, Kotra, Shi, and Jia teach all of the limitations of claims 34 and 54, as discussed above. Kotra further teaches:
wherein the threshold PSNR value is a threshold for a luma component of the first source picture (e.g. Abstract: describing that the PSNR distortion value is for a luma component of the picture).
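For illustration only, the comparison recited in claims 54 and 55 can be sketched as below. This is a minimal sketch under stated assumptions and is not the method of Kotra, Shi, or Jia: luma samples are assumed to be 8-bit values held in flat lists, the threshold is assumed to be "satisfied" when the PSNR meets or exceeds it, and the names `psnr` and `select_resolution` are hypothetical.

```python
import math

def psnr(source, reconstructed, max_value=255):
    """PSNR in dB between two equally sized lists of luma samples."""
    if len(source) != len(reconstructed) or not source:
        raise ValueError("pictures must be non-empty and equally sized")
    mse = sum((s - r) ** 2 for s, r in zip(source, reconstructed)) / len(source)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(max_value ** 2 / mse)

def select_resolution(source_luma, upscaled_luma, threshold_db):
    """Return "reduced" when the luma PSNR meets the threshold, else "source"."""
    if psnr(source_luma, upscaled_luma) >= threshold_db:
        return "reduced"  # threshold satisfied: encode at the reduced resolution
    return "source"       # threshold not satisfied: keep the source resolution
```

Here `upscaled_luma` stands for the luma of the picture after down-scaling and up-scaling, so a high PSNR indicates that little detail is lost at the reduced resolution.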
Claim(s) 37 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kotra et al., “AHG11: Neural Network-based Super Resolution”, 21st Meeting of Joint Video Experts Team (JVET) of ITU-T SG WP 3 and ISO-IEC JTC 1/SC 29, No. JVET-U0099-v2, 6-15 January 2021 (hereinafter Kotra), as cited by applicant, in view of Shi et al. (US 11,553,188) (hereinafter Shi) as applied to claim 34 above, and further in view of Tourapis et al. (US 2016/0269733) (hereinafter Tourapis).
Regarding claim 37, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein: comparing the up-scaled picture to the first source picture comprises determining a sum of absolute differences (SAD) or sum of squared differences (SSD) between the pictures.
Tourapis, however, teaches a video encoding method:
wherein: comparing the up-scaled picture to the first source picture comprises determining a sum of absolute differences (SAD) or sum of squared differences (SSD) between the pictures (e.g. par. 56: describing that the system compares the upsampled picture with the input picture by determining a sum of absolute differences (SAD) or a sum of squared errors between the pictures, wherein a sum of squared errors is the equivalent of the sum of squared differences).
It therefore would have been obvious to one of ordinary skill in the art to modify the teachings of Kotra by adding the teachings of Tourapis in order for comparing the up-scaled picture to the first source picture to comprise determining a sum of absolute differences (SAD) or sum of squared differences (SSD) between the pictures. One of ordinary skill in the art would have been motivated to make such a modification because the modification allows for improved coding processes resulting in less visual distortion (Tourapis, e.g. par. 17: describing a desire to provide an improved encoding process that results in less visual distortion at the output of a decoder).
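For illustration only, the two metrics named in claim 37 can be sketched as follows. This is a minimal sketch that is not taken from Tourapis; pictures are assumed to be equally sized two-dimensional lists of sample values, and the function names `sad` and `ssd` are hypothetical.

```python
def _pairs(picture_a, picture_b):
    """Yield co-located sample pairs from two equally sized 2-D pictures."""
    if len(picture_a) != len(picture_b):
        raise ValueError("pictures must have the same number of rows")
    for row_a, row_b in zip(picture_a, picture_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        yield from zip(row_a, row_b)

def sad(picture_a, picture_b):
    """Sum of absolute differences between co-located samples."""
    return sum(abs(a - b) for a, b in _pairs(picture_a, picture_b))

def ssd(picture_a, picture_b):
    """Sum of squared differences between co-located samples."""
    return sum((a - b) ** 2 for a, b in _pairs(picture_a, picture_b))
```

Both values are zero for identical pictures and grow as the up-scaled picture departs from the source; SSD weights large sample errors more heavily than SAD.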
Claim(s) 44 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kotra et al., “AHG11: Neural Network-based Super Resolution”, 21st Meeting of Joint Video Experts Team (JVET) of ITU-T SG WP 3 and ISO-IEC JTC 1/SC 29, No. JVET-U0099-v2, 6-15 January 2021 (hereinafter Kotra), as cited by applicant, in view of Shi et al. (US 11,553,188) (hereinafter Shi) as applied to claim 34 above, and further in view of Choi et al. (US 2020/0404269) (hereinafter Choi).
Regarding claim 44, Kotra and Shi teach all of the limitations of claim 34, as discussed above. Kotra does not explicitly teach:
wherein:
selecting the picture resolution is for a block of the source picture;
performing the encoding operation comprises encoding the block with the selected picture resolution wherein: (i) the block is a quarter of a picture; (ii) the block is a central part of a picture; or (iii) the block is a Coding Unit Tree (CTU); and
performing an encoding operation comprises encoding an indication of one or more of: (i) the selected resolution; and (ii) one or more of luma or chroma resolutions.
Choi, however, teaches a method for determining a resolution:
wherein:
selecting the picture resolution is for a block of the source picture (e.g. pars. 103 – 104: describing that the resolution is for a sub-picture of the picture, the sub-picture being a block);
performing the encoding operation comprises encoding the block with the selected picture resolution wherein: (i) the block is a quarter of a picture; (ii) the block is a central part of a picture; or (iii) the block is a Coding Unit Tree (CTU) (e.g. pars. 103 – 104: describing that the system encodes a sub-picture of the picture, the sub-picture being a macroblock, wherein a macroblock is the equivalent of the Coding Tree Unit [CTU]); and
performing an encoding operation comprises encoding an indication of one or more of: (i) the selected resolution; and (ii) one or more of luma or chroma resolutions (e.g. par. 10: describing that coded video data includes adaptive resolution change information of the sub-picture or picture, the adaptive resolution change information including a resolution of the picture).
It therefore would have been obvious to modify the teachings of Kotra by adding the teachings of Choi in order for selecting the picture resolution to be for a block of the source picture, for performing the encoding operation to comprise encoding the block with the selected picture resolution wherein: (i) the block is a quarter of a picture; (ii) the block is a central part of a picture; or (iii) the block is a Coding Unit Tree (CTU), and for performing an encoding operation to comprise encoding an indication of one or more of: (i) the selected resolution; and (ii) one or more of luma or chroma resolutions. One of ordinary skill in the art would have been motivated to make such a modification because the modification improves coding efficiency.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHANIKA M BRUMFIELD whose telephone number is (571)270-3700. The examiner can normally be reached M-F 8:30 - 5 PM AWS.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Czekaj, can be reached at 571-272-7327. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
SHANIKA M. BRUMFIELD
Examiner
Art Unit 2487
/SHANIKA M BRUMFIELD/Examiner, Art Unit 2487