Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Election/Restrictions
Applicant’s election without traverse of Group I, claims 1-5 and 17-20, in the reply filed on 2/19/2026 is acknowledged. Claims depending on an allowable independent claim will be rejoined.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-5 and 17-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by the NPL reference “Transformer-based Image Compression,” arXiv 2021 (hereinafter “Lu”).
Regarding Claim 1, Lu discloses a method of video processing (Transformer-based image compression, title), comprising:
during a conversion (compression, Abstract, arithmetic encoding (AE), Fig. 2, arithmetic decoding (AD), Fig. 2, reconstruct pixel blocks, Section 3 page 4) between a video unit (input x, Section 3 page 4; input image, Section 3.2 page 5) of a video (video, Abstract) and a bitstream (bits, Fig. 2) of the video (video, Abstract), applying a signal process (transforms the input x into the latent features y, Section 3.1 page 4, using 3 NTUs, Section 3.2 page 5) to the video unit (input x, Section 3 page 4; input image, Section 3.2 page 5) based at least in part on a window-based attention module (Swin transformer block, Section 1.2 page 2; having window attention and shifted window attention, Section 3.2 page 6);
and performing the conversion (arithmetic encoding/decoding, Fig. 2, Section 3.2 page 5) based on the processed video unit (quantize y into y-hat and entropy code y-hat, Section 3.1 page 4).
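For illustration of the mapping above, the cited pipeline (analysis transform, quantization, entropy coding; Fig. 2, Sections 3.1-3.2 of Lu) can be sketched as follows. This is a minimal reconstruction, not Lu's code: the names AnalysisTransform, compress, y, and y_hat are assumptions, a single convolution stands in for Lu's three NTUs, and the arithmetic coder is omitted.

```python
# Minimal sketch of the cited pipeline: x -> y (analysis transform),
# y -> y_hat (quantization), then entropy coding of y_hat (Fig. 2).
# All names are hypothetical; one conv layer stands in for the NTUs.
import torch
import torch.nn as nn

class AnalysisTransform(nn.Module):
    """Stand-in for Lu's analysis transform (three NTUs, Section 3.2)."""
    def __init__(self, ch_in=3, ch_latent=192):
        super().__init__()
        self.conv = nn.Conv2d(ch_in, ch_latent, kernel_size=5, stride=2, padding=2)

    def forward(self, x):
        return self.conv(x)  # input x -> latent features y (Section 3.1)

def compress(x):
    y = AnalysisTransform()(x)  # analysis transform (Section 3.1)
    y_hat = torch.round(y)      # quantize y into y_hat (Section 3.1)
    # Arithmetic encoding of y_hat (AE, Fig. 2) would follow here; it is
    # omitted because it requires a learned entropy model.
    return y_hat

print(compress(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 192, 32, 32])
```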
Regarding Claim 2, Lu discloses the method of claim 1, wherein the signal process comprises a restoration of the video unit,
or wherein the window-based attention module is applied to a compression of the video unit, and the compression comprises a non-learning based compression and a learning based compression,
or wherein the window-based attention module is applied to a super-resolution of the video unit,
or wherein the window-based attention module is applied to an in-loop filtering in the compression of the video unit,
or wherein the window-based attention module is applied to at least one of: a pre-processing or a post-processing of the video unit,
or wherein the window-based attention module is applied in a compression framework (transforming input x into features y is part of image compression, Section 3.1 page 4).
Under the broadest reasonable interpretation, this series of disjunctive clauses requires only one of the alternatives to be performed; the final alternative is met as cited above.
Regarding Claim 3, Lu discloses the method of claim 1, wherein applying the signal process comprises:
applying the signal process (using 3 NTUs, Section 3.2 page 5) to the video unit (on the input image x, Fig. 2, Section 3.1, 3.2) based on a combination of the window-based attention module and a convolutional network (each NTU has a Swin transformer block and a convolutional layer, Section 1.2 page 2).
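For illustration of the cited combination (a convolutional layer paired with window-based attention, loosely mirroring the NTU of Section 1.2), a minimal sketch follows. This is a reconstruction under stated assumptions, not Lu's code: the class names WindowAttention and ConvAttnBlock and all parameter choices are hypothetical, and shifted-window attention is omitted.

```python
# Illustrative combination of a convolutional layer with attention that is
# restricted to non-overlapping windows. Names are hypothetical.
import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    """Self-attention computed independently within each s x s window."""
    def __init__(self, dim, window=8, heads=4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                      # x: (B, C, H, W), H and W divisible by window
        b, c, h, w = x.shape
        s = self.window
        # Partition the feature map into s x s windows, each a token sequence.
        x = x.view(b, c, h // s, s, w // s, s)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, s * s, c)
        x, _ = self.attn(x, x, x)              # attention within each window only
        # Reverse the window partition back to (B, C, H, W).
        x = x.view(b, h // s, w // s, s, s, c)
        x = x.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)
        return x

class ConvAttnBlock(nn.Module):
    """Convolution followed by residual window attention, loosely NTU-like."""
    def __init__(self, ch_in, ch_out, window=8):
        super().__init__()
        self.conv = nn.Conv2d(ch_in, ch_out, 3, stride=2, padding=1)
        self.attn = WindowAttention(ch_out, window)

    def forward(self, x):
        x = self.conv(x)
        return x + self.attn(x)

block = ConvAttnBlock(3, 64)
print(block(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 64, 32, 32])
```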
Regarding Claim 4, Lu discloses the method of claim 3, wherein a convolution layer in the convolutional network is replaced by a layer of the window-based attention module (the NTU, having a Swin transformer block and a convolutional layer, is used as the basic module of the VAE architecture, Section 2.1 page 2),
or wherein a portion of modules in the convolutional network is replaced by the window-based attention module (the NTU, having a Swin transformer block and a convolutional layer, is used as the basic module of the VAE architecture, Section 2.1 page 2).
Under the broadest reasonable interpretation, this series of disjunctive clauses requires only one of the alternatives to be performed.
Regarding Claim 5, Lu discloses the method of claim 4, wherein the convolutional network comprises a convolution-based compression network (each NTU has a Swin transformer block and a convolutional layer, Section 1.2 page 2), and a convolution layer of the convolution-based compression network is replaced by the layer of the window-based attention module (the VAE architecture being a stack of convolutional layers, Section 1.1 page 1; the NTU, having a Swin transformer block and a convolutional layer, is used as the basic module of the VAE architecture, Section 2.1 page 2),
or wherein the convolutional network comprises a convolution-based super-resolution network, and a convolution layer of the convolution-based super-resolution network is replaced by the layer of the window-based attention module,
or wherein the convolutional network comprises the convolution-based compression network, and an encoder in the convolution-based compression network is replaced by the window-based attention module,
or wherein the convolutional network comprises the convolution-based compression network, and a decoder in the convolution-based compression network is replaced by the window-based attention module,
or wherein the convolutional network comprises a super-resolution network, and a residual block in the super-resolution network is replaced by the window-based attention module.
Under the broadest reasonable interpretation, this series of disjunctive clauses requires only one of the alternatives to be performed; the first alternative is met as cited above.
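For illustration of the claimed replacement, a sketch follows in which one convolution layer of a stacked convolutional encoder (the VAE style of Section 1.1) is swapped for a window-attention layer. It reuses the hypothetical WindowAttention class from the sketch following Claim 3; the encoder structure, channel counts, and layer choices are assumptions, not Lu's architecture.

```python
# Replacing one convolution layer with a window-attention layer.
# Assumes the WindowAttention class from the previous sketch is in scope.
import torch
import torch.nn as nn

def conv_encoder(ch=64, latent=192):
    """All-convolution analysis transform (stacked conv layers, Section 1.1)."""
    return nn.Sequential(
        nn.Conv2d(3, ch, 5, stride=2, padding=2),
        nn.Conv2d(ch, ch, 5, stride=2, padding=2),  # layer to be replaced
        nn.Conv2d(ch, latent, 5, stride=2, padding=2),
    )

def hybrid_encoder(ch=64, latent=192):
    """Same stack with the middle convolution replaced by window attention."""
    return nn.Sequential(
        nn.Conv2d(3, ch, 5, stride=2, padding=2),
        WindowAttention(ch, window=8),              # replacement layer
        nn.Conv2d(ch, latent, 5, stride=2, padding=2),
    )

x = torch.randn(1, 3, 64, 64)
print(conv_encoder()(x).shape)    # torch.Size([1, 192, 8, 8])
print(hybrid_encoder()(x).shape)  # torch.Size([1, 192, 16, 16])
```

Note that the attention layer preserves spatial resolution, so the hybrid stack downsamples less than the all-convolution stack; in a real network the surrounding layers would be adjusted accordingly.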
Regarding Claim 17, Lu discloses the method of claim 1, wherein the conversion includes encoding the video unit into the bitstream, or wherein the conversion includes decoding the video unit from the bitstream (arithmetic encoding/decoding, Fig. 2, Section 3.2 page 5).
Regarding Claim 18, Lu discloses an apparatus for video processing comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor (TIC is implemented on top of the open-source CompressAI PyTorch library, Section 4 page 7) …. The remainder of Claim 18 is rejected on the grounds provided for Claim 1.
Regarding Claim 19, Lu discloses a non-transitory computer-readable storage medium storing instructions that cause a processor (TIC is implemented on top of the open-source CompressAI PyTorch library, Section 4 page 7) …. The remainder of Claim 19 is rejected on the grounds provided for Claim 1.
Claim 20 is rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ge (US 20230396810 A1, hereinafter “Ge”).
Regarding Claim 20, Ge discloses a non-transitory computer-readable recording medium storing a bitstream (encoded bitstream 21 may be transmitted to the video decoder 30, or may be stored in a memory, [0170]) …. The remainder of Claim 20 has no patentable weight. See MPEP 2112.01.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20220239944 A1 – auto-encoder with self-attention
US 12120348 B2 – auto-encoder with a shifted-window self-attention block (Swin)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHADAN E HAGHANI whose telephone number is (571)270-5631. The examiner can normally be reached M-F 9AM - 5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jay Patel, can be reached at 571-272-2988. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHADAN E HAGHANI/Examiner, Art Unit 2485