DETAILED ACTION
Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
2. This Office Action is in response to the application filed on 10/24/2024. Claims 1-20 have been examined.
Priority
3. Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Information Disclosure Statement
4. The information disclosure statements (IDS) submitted on 10/24/2024, 06/27/2025, and 07/10/2025 were filed in accordance with the provisions of 37 CFR 1.97. Accordingly, they are being considered by the examiner.
Claim Rejections - 35 USC § 103
5. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all
obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention
is not identically disclosed as set forth in section 102, if the differences between the claimed
invention and the prior art are such that the claimed invention as a whole would have been obvious
before the effective filing date of the claimed invention to a person having ordinary skill in the art
to which the claimed invention pertains. Patentability shall not be negated by the manner in which
the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35
U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
6. Claims 1-8 and 11-18 are rejected under 35 U.S.C. 103 as being unpatentable over Wang; Zhao (US-20210329286-A1), hereinafter "Wang", in view of Ma, Si-wei (CN-108184129-A) (translation provided; citations are from the translated document), hereinafter "Ma".
Regarding Claim 1 Wang-Ma
Wang discloses 1. A picture filtering method (Wang, Abs, "…provides methods for convolutional-neural-network (CNN) based filter for video coding,…") performed by an electronic device, (Wang, [0013] "FIG. 4 is a block diagram of an exemplary apparatus for encoding or decoding a video,…") comprising:
obtaining a picture to be filtered; (Wang, [0081] “…each CTU is treated as an independent region and each CTU can select different CNN models….”)
determining a neural network filter; (Wang, [0081] "…, convolutional neutral networks (CNN) based image/video compression and CNN based loop/post filters can be used…")
. . . and
filtering the one or more picture blocks based on the neural network filter to obtain a filtered picture. (Wang, [0090] "…applying the CNN filter on the whole video frames, …the control decision can be applied on the frame level, CTU level or filter unit level." i.e., applying the CNN filter at the CTU level or filter-unit (block) level, which corresponds to filtering one or more picture blocks using the neural network filter to obtain a filtered picture.)
Wang does not explicitly disclose
dividing the picture to be filtered according to a blocking mode corresponding to the neural network filter to obtain one or more picture blocks to be filtered, the blocking mode being a same blocking mode for a training picture used in training the neural network filter;
However, in the same field of endeavor, Ma more explicitly discloses the following:
dividing the picture to be filtered according to a blocking mode corresponding to the neural network filter to obtain one or more picture blocks to be filtered, the blocking mode being a same blocking mode for a training picture used in training the neural network filter; (Ma, [0067] “when the image block uses coding tree units as the basic unit, the original image can be divided into multiple non-overlapping CTUs according to the CTU size in the video coding framework. At this point, for each CTU, all pixels of that CTU can be normalized and then used as input to a deep neural network (i.e., the at least one first filtering neural network or the at least one second filtering neural network) for filtering, and the trained deep convolutional network can be invoked for filtering. Similarly, training data also needs to be normalized during training.”)
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Wang in view of Ma to divide "the picture to be filtered according to a blocking mode corresponding to the neural network filter to obtain one or more picture blocks to be filtered, the blocking mode being a same blocking mode for a training picture used in training the neural network filter," as suggested by Ma.
One of ordinary skill in the art would have been motivated to incorporate Ma's blocking-mode-based division into Wang "to improve encoding and decoding performance." (Ma, [0048]).
Note: The motivation utilized in the rejection of claim 1 applies equally to claims 1-8 and 11-18.
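As a purely illustrative aside, the CTU-based division and normalization that Ma's [0067] (quoted above) describes can be sketched in Python as follows. This sketch is an illustration only, not code from either reference; the 128-sample CTU size, the function name, and the [0, 1] normalization are assumptions chosen for exposition.

```python
# Illustrative sketch only; not taken from Wang or Ma. The 128-sample CTU
# size, function name, and [0, 1] normalization are assumptions.
import numpy as np

def divide_into_ctus(picture: np.ndarray, ctu_size: int = 128) -> list:
    """Divide a picture into non-overlapping CTU-sized blocks and normalize
    each block, mirroring the division and normalization Ma's [0067]
    describes for both inference and training data."""
    height, width = picture.shape[:2]
    blocks = []
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            block = picture[y:y + ctu_size, x:x + ctu_size]
            # Normalize pixels before feeding the filtering neural network.
            blocks.append(block.astype(np.float32) / 255.0)
    return blocks
```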
Regarding Claim 2 Wang-Ma
Wang-Ma discloses 2. The method according to claim 1,
wherein the blocking mode for the training picture comprises determining one or more coding tree units (CTUs) in the training picture as a training picture block; (Wang, [0093] "In some embodiments, the target coding block may be a coding tree unit (CTU). In some embodiments, the target coding block may also refer to a filter unit filtered by the CNN filter…" i.e., the target coding block is the unit filtered by the CNN filter. Thus determining CTUs in the training picture corresponds to determining the training picture blocks used for training the neural network filter.) and
wherein the dividing the picture to be filtered comprises determining the one or more picture blocks from the picture to be filtered in a mode of using one or more CTUs as a picture block to be filtered. (Ma, [0067] “when the image block uses coding tree units as the basic unit, the original image can be divided into multiple non-overlapping CTUs according to the CTU size in the video coding framework. At this point, for each CTU, all pixels of that CTU can be normalized and then used as input to a deep neural network (i.e., the at least one first filtering neural network or the at least one second filtering neural network) for filtering, and the trained deep convolutional network can be invoked for filtering. Similarly, training data also needs to be normalized during training...”)
Regarding Claim 3 Wang-Ma
Wang-Ma discloses 3. The method according to claim 1,
wherein the blocking mode for the training picture comprises determining one or more residual coding tree units (CTUs) in the training picture as a training picture block, (Wang, [0102] "…, before the residual associated with the target coding block is determined, the CNN filter can be trained based on a training data set. The training data set can include a training block and extended region of the training block." See also [0093], which identifies the target coding block as a CTU.)
wherein the dividing the picture to be filtered comprises determining the one or more picture blocks from the picture to be filtered in a mode of using one or more residual CTUs as a picture block to be filtered. (Ma, [0067] explains that, in HEVC, the original image is divided into multiple non-overlapping CTUs according to the CTU size in the video coding framework. Because Wang processes each CTU to obtain the residual for that CTU, the CTUs resulting from this dividing correspond to residual CTUs, satisfying the dividing requirement of the claim.)
Regarding Claim 4 Wang-Ma
Wang-Ma discloses 4. The method according to claim 2,
wherein the neural network filter is obtained by training an extended picture block of the training picture block, (Wang, [0102] "The extended regions can be used in the training of the CNN filter. …The training data set can include a training block and extended region of the training block.") and wherein the filtering the one or more picture blocks comprises:
for each of the one or more picture blocks, extending the picture block to be filtered according to an extension mode for the training picture block to obtain the extended picture block; (Wang, [0089] "an extended block can be used … The extended regions include the neighboring samples around the target image block." Additionally, [0090] supports extension during filtering.)
filtering the extended picture block based on the neural network filter to obtain a filtered extended picture block; (Wang, [0089] “During the filtering process, …the samples in the target image block are filtered and the samples in the extended region of the target image block are kept unchanged”)
determining a picture region corresponding to the picture block to be filtered in the filtered extended picture block as a filtered picture block corresponding to the picture block to be filtered. (Wang's filtering keeps the extended region unchanged and filters only the target image block: Wang, [0089] "…only the samples in the target image block are filtered and the samples in the extended region …are kept unchanged.")
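Purely as an illustrative aside (a hypothetical sketch, not the implementation of either cited reference), the extend-filter-crop sequence addressed in claim 4 can be summarized as follows. The 4-sample extension width tracks Wang's [0089] example of four extra rows and columns; the filter callable, the helper name, and the boundary clamping are assumptions.

```python
# Illustrative sketch only; the filter callable and boundary handling are
# assumptions. The 4-sample extension follows Wang's [0089] example of
# four extra rows and columns around the target block.
import numpy as np

def filter_block_with_extension(picture, y, x, size, cnn_filter, ext=4):
    """Extend a block outward, filter the extended block, then keep only
    the region corresponding to the original block (cf. Wang [0089])."""
    height, width = picture.shape[:2]
    # Gather the block plus up to `ext` neighboring samples on each side,
    # clamping at the picture boundary.
    y0, y1 = max(y - ext, 0), min(y + size + ext, height)
    x0, x1 = max(x - ext, 0), min(x + size + ext, width)
    extended = picture[y0:y1, x0:x1]      # block plus neighboring samples
    filtered = cnn_filter(extended)       # filter the extended block
    oy, ox = y - y0, x - x0               # offset of the original block
    return filtered[oy:oy + size, ox:ox + size]
```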
Regarding Claim 5 Wang-Ma
Wang-Ma discloses 5. The method according to claim 4,
wherein the extension mode comprises extending at least one first boundary region of the training picture block outwards, and wherein the extending the picture block to be filtered comprises extending at least one second boundary region of the picture block to be filtered outwards to obtain the extended picture block. (Wang, [0089] teaches boundary extensions by stating that “four more rows and four more columns of the target image block and reference image block are used as the extended regions.” and that “the extended regions include the neighboring samples around the target image block and reference image block.” i.e., it discloses outward extension of boundary regions to obtain the extended picture regions.)
Regarding Claim 6 Wang-Ma
Wang-Ma discloses 6. The method according to claim 1,
wherein the training picture comprises an input picture, wherein input data in the training the neural network filter comprises an input picture block and a first reference picture block of the input picture block, wherein the input picture block is obtained by performing picture division on the input picture based on the blocking mode, and wherein the filtering the one or more picture blocks (Wang, [0100] "In some embodiments, the first convolutional layer of the CNN filter can be used to extract spatial features of the image data associated with the target coding block and the reference block…. (FIG. 7) to image data associated with the target coding block, to generate feature maps associated with the target coding, … (FIG. 7) to image data associated with the reference block, to generate feature maps associated with the reference block. The spatial features extracted from the target coding block and reference block can be fused and input to the subsequent convolutional layers for further processing…" See also Figures 7-9.) comprises:
for each of the one or more picture blocks, determining a second reference picture block of a picture block to be filtered; (Wang discloses applying motion estimation to a target coding block to determine reference block(s) from one or more reference pictures ([0092]-[0095]). Since motion estimation is performed with respect to available reference pictures, Wang inherently determines a second reference block corresponding to a second reference picture.) and
inputting the picture block to be filtered and the second reference picture block into the neural network filter to obtain a filtered picture block of the picture block to be filtered. (Wang, Figure 9, discloses inputting image data associated with the target coding block and the reference block to a CNN filter ([0096]-[0100]): [0096] "In step 904, image data associated with the target coding block and the reference block is input to the disclosed CNN filter." [0099] "In step 906, the CNN filter is executed to determine a residual associated with the target coding block based on the input image data.")
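For illustration only, a two-branch network of the kind Wang's [0100] describes, which extracts spatial features from the target block and the reference block separately and then fuses them for further processing, might be sketched as follows. The channel counts, kernel sizes, and class name are assumptions, not Wang's disclosed architecture.

```python
# Illustrative sketch only; channel counts, kernel sizes, and module names
# are assumptions, not Wang's disclosed architecture.
import torch
import torch.nn as nn

class TwoBranchCNNFilter(nn.Module):
    """Extract spatial features from a target block and a reference block,
    fuse them, and predict a residual for the target block (cf. Wang
    [0099]-[0100])."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.target_branch = nn.Conv2d(1, channels, 3, padding=1)
        self.reference_branch = nn.Conv2d(1, channels, 3, padding=1)
        self.fusion = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1),  # residual output
        )

    def forward(self, target: torch.Tensor, reference: torch.Tensor):
        # Fuse the spatial features of the two branches along the channel axis.
        fused = torch.cat([self.target_branch(target),
                           self.reference_branch(reference)], dim=1)
        # Filtered block = input block plus the predicted residual.
        return target + self.fusion(fused)
```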
Regarding Claim 7 Wang-Ma
Wang-Ma discloses 7. The method according to claim 6, wherein the determining the second reference picture block comprises:
obtaining a determining mode for the first reference picture block; (Wang, [0052] "FIG. 2B…additionally includes mode decision stage 230…"; [0059] "…at mode decision stage 230, …can select a prediction mode (e.g., one of the intra prediction or the inter prediction)…") and
determining the second reference picture block according to the determining mode, and wherein the determining mode determines the first reference picture block according to at least one of: spatial domain information of the input picture block or time domain information of the input picture block. (Wang, [0067] “Based on the prediction mode indicator, the decoder can decide whether to perform a spatial prediction (e.g., the intra prediction) at spatial prediction stage 2042 or a temporal prediction (e.g., the inter prediction) at temporal prediction stage 2044. The details of performing such spatial prediction or temporal prediction are described in FIG. 2B…”)
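As a purely illustrative aside (hypothetical names and logic, not drawn from the cited references), the mode-dependent determination discussed in claim 7, where a prediction mode indicator selects between spatial and temporal prediction (Wang, [0067]), could be sketched as:

```python
# Illustrative sketch only; the names and selection logic are assumptions
# made for exposition, not the cited references' implementations.
from enum import Enum

class PredictionMode(Enum):
    INTRA = "intra"   # spatial prediction (cf. Wang [0067])
    INTER = "inter"   # temporal prediction (cf. Wang [0067])

def select_reference_block(mode, spatial_neighbor_block, motion_estimated_block):
    """Return the reference block to feed the CNN filter, chosen according
    to the prediction mode indicator."""
    if mode is PredictionMode.INTER:
        return motion_estimated_block    # temporal reference via motion estimation
    return spatial_neighbor_block        # spatial reference from neighboring samples
```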
Regarding Claim 8 Wang-Ma
Wang-Ma discloses 8. The method according to claim 7,
wherein the first reference picture block comprises at least one of:
a first temporal reference picture block of the input picture block
or a first spatial reference picture block of the input picture block, (Wang, [0084] "The present disclosure provides methods of spatial-temporal block based CNN filter…" [0087] "In some disclosed embodiments, the temporal reference image block is obtained by motion estimation in one or both of the training or the inference procedure.…") and
wherein the second reference picture block comprises at least one of:
a second temporal reference picture block of the picture block to be filtered (Wang, [0058] "Bidirectional inter predictions can use one or more reference pictures at both temporal directions …") or a second spatial reference picture block of the picture block to be filtered. (Wang, [0054] "… prediction reference 224 can include one or more neighboring BPUs … BPU from various directions…")
Regarding Claim 11 Wang-Ma
Wang discloses 11. A picture filtering apparatus (Wang, Fig. 4, 400 "apparatus"; [0069] "…apparatus 400 for encoding or decoding…") comprising:
at least one memory (Wang, Fig. 4, 404 "Memory") configured to store computer program code (Wang, [0069] "…memory 404 configured to store data (e.g., a set of instructions, computer codes, intermediate data, or the like)") and
at least one processor (Wang, Fig. 4, 402 "Processor") configured to read the program code and operate as instructed by the program code, the program code (Wang, "As shown in FIG. 4, apparatus 400 can include processor 402. When processor 402 executes instructions described herein, apparatus 400 can become a specialized machine for video encoding or decoding. Processor 402 can be any type of circuitry capable of manipulating or processing information.") comprising:
The remaining limitations of independent claim 11 recite features that are substantially similar to those set forth in independent claim 1. Accordingly, the reasoning and analysis provided with respect to claim 1 apply equally to claim 11.
Regarding Claim 20 Wang-Ma
Wang discloses 20. A non-transitory computer-readable storage medium, storing computer code which, when executed by at least one processor, causes the at least one processor to at least: (Wang, [0006] "…provide a non-transitory computer readable medium that stores a set of instructions that is executable by one or more processors of an apparatus to cause the apparatus to initiate a method for CNN based in-loop filter in video processing,…")
The remaining limitations of independent claim 20 recite features that are substantially similar to those set forth in independent claim 1. Accordingly, the reasoning and analysis provided with respect to claim 1 apply equally to claim 20.
Claim Rejections - 35 USC § 103
7. Claims 9, 10, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Wang-Ma in view of Na et al. (US-20210021823-A1), hereinafter "Na".
Regarding Claim 9 Wang-Ma-Na
Wang-Ma discloses 9. The method according to claim 8,
Wang-Ma does not explicitly disclose
wherein the determining the second reference picture block comprises determining the second spatial reference picture block according to a first determining mode for the first spatial reference picture block.
However, in the same field of endeavor, Na more explicitly discloses the following:
wherein the determining the second reference picture block comprises determining the second spatial reference picture block according to a first determining mode for the first spatial reference picture block. (Na, [0185] "…'input data' may be composed of a reference region encoded before the current block." [0186] "The reference region may include at least one block (or region) of a neighboring region adjacent to the current block…" See also Fig. 15(a), neighboring blocks; [0188] "…showing top-left block A, top-block B, and left block C…"; [0208] "The input data may be configured in various combinations according to the directionality of intra-prediction. For example, when the directionality of intra-prediction is a horizontal direction, the input data may be composed of one or more blocks selected from among the left neighboring blocks Ar, Cr, and Er….")
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Wang-Ma in view of Na to determine "the second spatial reference picture block according to a first determining mode for the first spatial reference picture block," as suggested by Na.
One of ordinary skill in the art would have been motivated to incorporate Na's teaching of selecting the spatial reference block based on prediction direction into Wang-Ma in order to "improve the accuracy of the training process and inference process of the CNN." (Na, [0189]).
Note: The motivation utilized in the rejection of claim 9 applies equally to claims 10 and 19.
Regarding Claim 10 Wang-Ma-Na
Wang-Ma-Na discloses 10. The method according to claim 9,
wherein the first spatial reference picture block comprises at least one of:
a first upper left picture block of the input picture block, a first left-side picture block of the input picture block, or a first upper picture block of the input picture block, (Na, [0188] “Referring to FIG. 15(a), the reference region in units of blocks, that is, neighboring blocks, may include a left block C, a top block B, a top-right block D, a bottom-left block E, and a top-left block A, which are adjacent to a current block X. In this specification, the original block (i.e., unencoded block), prediction block, and reconstructed block of a neighboring block are denoted differently. For example, for the top-left block A, ...”) and
wherein the determining the second spatial reference picture block comprises determining at least one of:
a second upper left picture block of the picture block to be filtered, a second left-side picture block of the picture block to be filtered, or a second upper picture block of the picture block to be filtered as the second spatial reference picture block. (Na, [0198] "FIG. 17 exemplarily illustrates a prediction direction suitable for a current block in light of the shape of pixel values of neighboring blocks." [0199] "In FIG. 17, the neighboring blocks of the current block X includes a top-left block A, a top block B, and a left block C.")
Regarding Claims 12-19
Claims 12-19 recite limitations that are substantially similar to those of dependent claims 2-9, respectively, except that claims 12-19 are directed to apparatus rather than a method. Accordingly, the reasoning and rejection set forth with respect to claims 2-9 apply equally to claims 12-19.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ASTEWAYE GETTU ZEWEDE whose telephone number is (703)756-1441. The examiner can normally be reached Mo-Fr 8:30 am to 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn can be reached on (571)272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ASTEWAYE GETTU ZEWEDE/
Examiner, Art Unit 2481

/WILLIAM C VAUGHN JR/
Supervisory Patent Examiner, Art Unit 2481