DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1, 6, 8-10, 23, 28, 31-32, 44-45, 47-49 and 51-56 are pending. Claims 1, 23, 44 and 48 are amended. Claims 2-5, 7, 11-22, 24-27, 29-30, 33-43, 46, and 50 have been cancelled.
Response to Arguments / Amendments
Rejections under 35 U.S.C. § 103:
Applicant’s arguments have been fully considered but are rendered moot in view of the new ground of rejection necessitated by amendments initiated by the applicant.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 6, 8-10, 23, 28, 31-32, 44-45, 47-49 and 51-56 are rejected under 35 U.S.C. 103 as being unpatentable over Aytekin et al. ("Block-optimized Variable Bit Rate Neural Image Compression", ARXIV.ORG, May 2018, hereinafter Aytekin) in view of Hannuksela et al. (US 20190268599, hereinafter Hannuksela) and Lew et al. (US 10594338 B1, hereinafter Lew), further in view of Coelho et al. (US 20200275101, hereinafter Coelho_101).
Regarding Claim 1, Aytekin discloses a method for video encoding, comprising:
accessing a picture, said picture partitioned into a plurality of blocks using different block sizes (Abstract, end-to-end block-based auto-encoder system for image compression with neural-network based image compression, mainly in achieving binarization simulation, variable bitrates with multiple networks, entropy friendly representations, inference-stage code optimization and performance-improving normalization layers in the auto-encoder; Section 1, a system for block-based image compression using auto-encoders, block-based neural auto-encoders; Section 2);
forming a first channel of an input based on at least a block of said picture (Section 2, 2.1, block codes are 1-dimensional, as the input to the network is of size 32x32 blocks from the image);
applying a neural network to said input to form output coefficients, said neural network having a plurality of network layers (Section 2.1, auto-encoder neural-network based image compression, mainly in achieving binarization simulation, variable bitrates with multiple networks, entropy friendly representations, inference-stage code optimization and performance-improving normalization layers in the auto-encoder; Section 2.2, the image is divided into 32x32 blocks by raster-scan during encoding and each block is encoded by the lowest bit rate neural network out of three [selecting one of the neural networks] which satisfies a target PSNR),
wherein each network layer of said plurality of network layers performs linear and non-linear operations (Section 2.1, encoder part contains five consecutive convolutional blocks that consist of a convolutional layer with stride 2 followed by a parametric rectified linear unit (PReLU)); and
entropy encoding said output coefficients (Section 2.2, each image is encoded into three vectors: 1) entropy-coded image code, 2) entropy-coded indicator vector, and 3) shape of the original image).
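For illustration only, the per-block network selection cited from Aytekin, Section 2.2 (raster-scan 32x32 blocks, each encoded by the lowest bit rate network that satisfies a target PSNR) can be sketched as follows. This is a minimal sketch under stated assumptions, not Aytekin's implementation: the function names are hypothetical, each "network" is a stand-in callable that returns a reconstruction, the networks are assumed to be ordered from lowest to highest bit rate, and falling back to the last network when none meets the target is an assumption.

```python
import math


def raster_blocks(height, width, size=32):
    """Yield (top, left) corners of size x size blocks in raster-scan order."""
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield top, left


def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB between two equal-length flat lists of samples."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def select_network(block, networks, target_psnr):
    """Return the index of the lowest-bit-rate network meeting target_psnr.

    `networks` is a hypothetical list of reconstruct(block) callables,
    ordered from lowest to highest bit rate. If none meets the target,
    the highest-bit-rate network is used (an assumption of this sketch).
    """
    for index, reconstruct in enumerate(networks):
        if psnr(block, reconstruct(block)) >= target_psnr:
            return index
    return len(networks) - 1
```

The returned index plays the role of one entry of the indicator vector mentioned in the citation; entropy coding of the block code and of that vector is outside the scope of this sketch.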
Aytekin does not explicitly disclose forming at least a second channel of said input based on at least a reconstructed neighboring block of said block, wherein said at least a reconstructed neighboring block of said block is mirrored when forming said at least a second channel, and wherein tensor concatenation is used to form said input from said first channel and said at least a second channel.
Hannuksela teaches forming at least a second channel of said input based on at least a reconstructed neighboring block of said block, wherein said at least a reconstructed neighboring block of said block is mirrored when forming said at least a second channel ([0292], resampled bottom stripe is mirrored vertically or rotated by 180 degrees and arranged into a third constituent frame partition, forming its effective picture area; [0295], FIGS. 14a and 14b, FIG. 15b, arranging the resampled top and bottom stripes and the middle stripe into constituent frame partitions and the additional block row is formed by rotating by 180 degrees the top row of the left part above the right part and vice versa, and likewise rotating by 180 degrees the bottom row of the left part below the right part and vice versa. These additional block rows are marked by “left-right mirroring”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the mirroring of a neighboring block when forming the input, as taught by Hannuksela ([0295]), into the encoding/decoding system of Aytekin in order to provide an encoding system that discards some information in an original video sequence to represent the video in a more compact form (Hannuksela, [0061]), resulting in the predictable result of improved coding efficiency.
Aytekin & Hannuksela do not explicitly disclose wherein tensor concatenation is used to form said input from said first channel and said at least a second channel.
Lew teaches wherein tensor concatenation is used to form said input from said first channel and said at least a second channel (col. 3, ll. 57-63, the input tensor associated with image content may be generated by a neural network architecture, such as by an encoding portion of an autoencoder; col. 4, ll. 22-27, the encoding system 130 includes a separate autoencoder network comprising sub-encoder and sub-decoder portions, and the sub-encoder receives the pre-quantized tensor as input and the sub-decoder outputs a predicted probability distribution at each tensor element).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the use of tensor concatenation to form the input, as taught by Lew (col. 4, ll. 22-27), into the encoding/decoding system of Aytekin in order to provide an encoding system that is enhanced by applying information from a quantization mask used for compressing data at each spatial location of the tensor, so that the compressed code is effectively transmitted to the receiver, since the sender encodes the information into a compressed code (Lew, col. 8, ll. 63-67).
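For illustration only, the claimed limitation discussed above (a second channel formed from a mirrored reconstructed neighboring block, combined with the first channel by tensor concatenation) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: nested Python lists stand in for tensors of shape [channels][height][width], all function names are hypothetical, and the choice of a top neighbor mirrored vertically follows the alternative recited in claim 6.

```python
def mirror_vertical(block):
    """Flip a 2-D block top-to-bottom (row order reversed)."""
    return [row[:] for row in reversed(block)]


def mirror_horizontal(block):
    """Flip a 2-D block left-to-right (each row reversed)."""
    return [row[::-1] for row in block]


def concat_channels(*tensors):
    """Concatenate tensors of shape [C_i][H][W] along the channel axis."""
    height = len(tensors[0][0])
    width = len(tensors[0][0][0])
    out = []
    for tensor in tensors:
        for channel in tensor:
            if len(channel) != height or any(len(r) != width for r in channel):
                raise ValueError("all channels must share the same height and width")
            out.append(channel)
    return out


def build_input(block, top_neighbor):
    """Form a two-channel input: the block, then the mirrored top neighbor."""
    first_channel = [block]
    second_channel = [mirror_vertical(top_neighbor)]
    return concat_channels(first_channel, second_channel)
```

A left neighbor would be passed through mirror_horizontal instead, and a top-left neighbor through both functions, mirroring the alternatives recited in claims 6 and 8.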
Aytekin, Hannuksela & Lew do not explicitly disclose said input further includes information indicating pixel locations in said block.
Coelho_101 teaches said input further includes information indicating pixel locations in said block ([0132], the left and top neighboring pixels of the block for which a partitioning is to be determined can be included in the input block to the CNN 1000; [0141], since the CNN 1000 is used to determine (e.g., infer, provide, etc.) a partition for the block 1002 that is to be intra-predicted, and as intra-prediction uses at least some samples (i.e., pixels) of neighboring blocks, at least some samples of the neighboring blocks can also be used as input to the concatenation layer 1016).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of a neural network based on a block size of said at least a block, as taught by Coelho_101 ([0004]), into the encoding/decoding system of Aytekin in order to provide an encoding system in which the machine-learning model closely matches the brute-force approach in coding efficiency but at a significantly lower computational cost, or with a regular or dataflow-oriented computational cost, provides increased speed and efficiency using more than one processor, and reduces the computational complexity of mode decision by machine learning (Coelho_101, [0003]).
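For illustration only, the passage cited from Coelho_101 ([0132]), in which the left and top neighboring pixels of a block are included in the input block to the CNN, can be sketched as follows. This is a minimal sketch under stated assumptions, not Coelho_101's implementation: the function name is hypothetical, a single row and a single column of neighboring samples are assumed, and the top-left corner sample is an assumption added so that the extended input stays rectangular.

```python
def extend_with_neighbors(block, top_row, left_col, corner):
    """Return an (N+1) x (N+1) input built from an N x N block.

    top_row:  N samples directly above the block.
    left_col: N samples directly to the left of the block.
    corner:   the single sample above and to the left (assumed here).
    """
    n = len(block)
    if len(top_row) != n or len(left_col) != n or any(len(row) != n for row in block):
        raise ValueError("block must be N x N with N top and N left samples")
    extended = [[corner] + list(top_row)]
    for left_sample, row in zip(left_col, block):
        extended.append([left_sample] + list(row))
    return extended
```

Under these assumptions, a 64x64 block would yield a 65x65 input in which the first row and first column carry the neighboring samples.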
Regarding Claim 6, Aytekin in view of Hannuksela and Lew further in view of Coelho_101 discloses the method of claim 1. Hannuksela discloses wherein a top neighboring block of said block is mirrored vertically when forming said at least a second channel, or wherein a left neighboring block of said block is mirrored horizontally when forming said at least a second channel ([0294], FIG. 15a, additional block row is located above and below the effective picture area and the top block row of the left constituent frame partition is rotated by 180 degrees to form a reference signal above the right constituent frame partition. The top block row of the right constituent frame partition is rotated by 180 degrees to form a reference signal above the left constituent frame partition. The bottom block row of the left constituent frame partition is rotated by 180 degrees to form a reference signal below the right constituent frame partition). The same rationale and motivation for obviousness applied to claim 1 above apply here.
Regarding Claim 8, Aytekin in view of Hannuksela and Lew further in view of Coelho_101 discloses the method of claim 1. Hannuksela discloses wherein a top-left neighboring block of said block is mirrored horizontally and vertically when forming said at least a second channel ([0177], resampling results in a new image which is represented with a different number of pixels in the horizontal and/or vertical direction; [0295], FIGS. 14a and 14b, FIG. 15b, arranging the resampled top and bottom stripes and the middle stripe into constituent frame partitions and the additional block row is formed by rotating by 180 degrees the top row of the left part above the right part and vice versa). The same rationale and motivation for obviousness applied to claim 1 above apply here.
Regarding Claim 9, Aytekin in view of Hannuksela and Lew further in view of Coelho_101 discloses the method of claim 1. Aytekin discloses wherein parameters for said plurality of network layers are further based on at least one of a block shape of said block (Section 1, hybrid encoding and decoding with neural network based approach [with block partition and concatenation]; Section 2).
Regarding Claim 10, Aytekin in view of Hannuksela and Lew further in view of Coelho_101 discloses the method of claim 1. Aytekin discloses wherein said block is extended to form said input (Section 1, hybrid encoding and decoding with neural network based approach [with input having block partition]).
Regarding Claim 53, Aytekin in view of Hannuksela and Lew further in view of Coelho_101 discloses the method of claim 1. Coelho_101 teaches further comprising: selecting a neural network from a plurality of neural networks, based on a block size of said block, wherein said plurality of neural networks correspond to said different block sizes ([0129], FIG. 10, convolutional neural network (CNN) 1000 for determining a block partition of an image block. The block can be a superblock. For example, the CNN can be used to determine the block size used in the intra/inter-prediction stage 402 of FIG. 4. The partition can be a quad-tree partition, such as described with respect to FIG. 7. The CNN 1000 can be used to determine a partition for an intra-coded block. As such, the block can be a block of an intra-coded frame, such as the frame 304 of FIG. 3. The CNN 1000 can be used by an encoder where the smallest possible block partition is an 8×8 partition. As such, determinations of whether to split a block need be made for blocks (i.e., sub-blocks of the superblock) that are 16×16 or larger).
The same rationale and motivation for obviousness applied to claim 1 above apply here.
Regarding Claims 23, 28, 31-32 and 54, decoding method claims 23, 28, 31-32 and 54 correspond to encoding method claims 1, 6, 8-10, 13 and 53, the rejections of which are incorporated herein for the same reasons of obviousness as used above.
Regarding Claims 44-45, 47 and 55, encoder claims 44-45, 47 and 55 use the corresponding encoding method claimed in claims 1, 6, 8-10, 13 and 53, the rejections of which are incorporated herein for the same reasons of obviousness as used above.
Regarding Claims 48-49, 52 and 56, decoder claims 48-49, 52 and 56 use the corresponding decoding method claimed in claims 23, 28, 31-32 and 54, the rejections of which are incorporated herein for the same reasons of obviousness as used above.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Samuel D Fereja whose telephone number is (469)295-9243. The examiner can normally be reached 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DAVID CZEKAJ, can be reached at (571) 272-7327. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SAMUEL D FEREJA/
Primary Examiner, Art Unit 2487