Prosecution Insights
Last updated: April 19, 2026
Application No. 19/010,466

GEOMETRIC TRANSFORM IN NEURAL NETWORK-BASED CODING TOOLS FOR VIDEO CODING

Status: Non-Final OA (§102)
Filed: Jan 06, 2025
Examiner: JEAN BAPTISTE, JERRY T
Art Unit: 2481
Tech Center: 2400 — Computer Networks
Assignee: Bytedance Inc.
OA Round: 1 (Non-Final)
Grant Probability: 87% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 2m
Grant Probability With Interview: 60%

Examiner Intelligence

Career Allow Rate: 87%, above average (500 granted / 572 resolved; +29.4% vs TC avg)
Interview Lift: -27.9% (with vs. without interview, across resolved cases with interview)
Avg Prosecution: 2y 2m (typical timeline)
Career History: 592 total applications across all art units; 20 currently pending

Statute-Specific Performance

§101: 6.5% (-33.5% vs TC avg)
§103: 49.5% (+9.5% vs TC avg)
§102: 10.6% (-29.4% vs TC avg)
§112: 12.5% (-27.5% vs TC avg)
Tech Center averages are estimates. Based on career data from 572 resolved cases.

Office Action

§102
DETAILED ACTION

This office action is in response to the application filed on 01/06/2025. Claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Acknowledgement is made of applicant's claim for priority to provisional application No. 63/358,745 filed on 07/06/2022.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 01/27/2025 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Specification

The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant's cooperation is requested in correcting any errors of which applicant may become aware in the specification.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1-3, 7-8 and 16-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Habibian (US 2020/0304804).
Regarding claim 1, Habibian discloses a method for processing video data (Habibian, abstract methods and apparatus for compressing video content), comprising: determining, for a conversion between a current video block of a video and a bitstream of the video (Habibian, paragraph 77 discloses operations 600 begin at block 602, where the system receives a compressed version of an encoded video content (e.g., from a transmitting device). The compressed version of the encoded video content may be received, for example, as a bitstream including one or more code words corresponding to one or more codes z representative of a compressed video or portion thereof) to modify a video unit which is associated with a processing module and performing the conversion based on the determining (Habibian, figure 4, paragraphs 52-64 discloses a code model 404, and an arithmetic coder 406, and the video decompression pipeline in the receiving device 420 includes an auto-encoder 421, code model 424, and arithmetic decoder 426; paragraph 9 discloses decompressing the compressed version of the encoded video content into a latent code space based on a probabilistic model implemented by a first artificial neural network, decoding the encoded video content out of the latent code space through an auto-encoder implemented by a second artificial neural network, and outputting the decoded video content for display). 
Regarding claim 2, Habibian discloses the method of claim 1, wherein the processing module comprises an in-loop processing module or a post-processing module; and the in-loop processing module or the post-processing module comprises at least one neural network (NN) model (Habibian, paragraph 9 discloses decompressing the compressed version of the encoded video content into a latent code space based on a probabilistic model implemented by a first artificial neural network, decoding the encoded video content out of the latent code space through an auto-encoder implemented by a second artificial neural network, and outputting the decoded video content for display).

Regarding claim 3, Habibian discloses the method of claim 1, wherein the video unit comprises a first video unit and/or a second video unit; the first video unit is modified in a first way before the first video unit is input into the processing module; the second video unit from the processing module is modified in a second way before the second video unit is used as final output or a reference picture; and the first video unit and/or the second video unit is modified by using one or more geometric transforms, and the geometric transforms comprise reflection, rotation, or flipping (Habibian, paragraph 11 discloses receiving a compressed version of an encoded video content; decompressing the compressed version of the encoded video content into a latent code space based on a probabilistic model implemented by a first artificial neural network; decoding the encoded video content out of the latent code space through an auto-encoder implemented by a second artificial neural network; and outputting the decoded video content for display).
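The geometric transforms recited in claim 3 (reflection, rotation, flipping) are simple array operations. Below is a minimal NumPy sketch of the claimed pattern: modify a unit one way before it enters a processing module, and another way before it is used as output or a reference picture. The `processing_module` stand-in and the choice of a horizontal flip are illustrative assumptions, not details from the application or Habibian.

```python
import numpy as np

def processing_module(unit):
    # Hypothetical stand-in for an in-loop / post-processing NN model.
    return unit

block = np.arange(16, dtype=np.float32).reshape(4, 4)  # toy 4x4 video unit

# "First way": flip the unit horizontally before it enters the module.
pre = np.fliplr(block)
out = processing_module(pre)

# "Second way": undo the transform before the unit is used as final
# output or a reference picture.
post = np.fliplr(out)
assert np.array_equal(post, block)  # round trip restores the original

# Rotation is another transform the claim names (90° counter-clockwise).
rotated = np.rot90(block)
```

With an identity module the round trip is exact; with a real NN filter, the point of the transform pair is that the module sees a canonical orientation while the reconstructed samples come back in the original one.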
Regarding claim 7, Habibian discloses the method of claim 2, wherein input of the NN model comprises samples from a current video unit and neighbouring video units, and the current video unit and the neighbouring video units are all modified before inputting into the NN model; and wherein modification for the current video unit and the neighbouring video units comprises flipping or rotation (Habibian, paragraph 11 discloses receiving a compressed version of an encoded video content; decompressing the compressed version of the encoded video content into a latent code space based on a probabilistic model implemented by a first artificial neural network; decoding the encoded video content out of the latent code space through an auto-encoder implemented by a second artificial neural network; and outputting the decoded video content for display).

Regarding claim 8, Habibian discloses the method of claim 1, wherein padding samples and existing samples of the video unit are all modified; or the padding samples are padded after modification is performed on the existing samples (Habibian, paragraph 37 discloses first set of feature maps 218 may be subsampled by a max pooling layer (not shown) to generate a second set of feature maps 220. The max pooling layer reduces the size of the first set of feature maps 218. That is, a size of the second set of feature maps 220, such as 14×14, is less than the size of the first set of feature maps 218, such as 28×28).

Regarding claim 16, Habibian discloses the method of claim 1, wherein the conversion includes encoding the current video block into the bitstream (Habibian, paragraph 77 discloses operations 600 begin at block 602, where the system receives a compressed version of an encoded video content (e.g., from a transmitting device). The compressed version of the encoded video content may be received, for example, as a bitstream including one or more code words corresponding to one or more codes z representative of a compressed video or portion thereof).

Regarding claim 17, Habibian discloses the method of claim 1, wherein the conversion includes decoding the current video block from the bitstream (Habibian, paragraph 66 discloses receiving device 420 may include an arithmetic decoder 426, a code model 424, and an auto-encoder 421. Auto-encoder 421 may include an encoder 422 and decoder 423 and may be trained using the same or a different training data set used to train auto-encoder 401 so that decoder 423, for a given input, can produce the same, or at least a similar, output as decoder 403).

With regard to claim 18, claim 18 discloses the same elements and features as claim 1 as outlined above. Therefore, the same rationale that was utilized in claim 1 applies equally as well to claim 18. In addition, Habibian paragraph 7 discloses at least one processor and a memory coupled to the at least one processor.

With regard to claim 19, claim 19 discloses the same elements and features as claim 1 as outlined above. Therefore, the same rationale that was utilized in claim 1 applies equally as well to claim 19. In addition, Habibian paragraph 7 discloses at least one processor and a memory coupled to the at least one processor.

Examiner's note: Machine readable media: when determining the scope of a claim directed to a computer-readable medium containing certain programming, the examiner should first look to the relationship between the programming and the intended computer system. Where the programming performs some function with respect to the computer with which it is associated, a functional relationship will be found.
For instance, a claim to computer-readable medium programmed with attribute data objects that perform the function of facilitating retrieval, addition, and removal of information in the intended computer system, establishes a functional relationship such that the claimed attribute data objects are given patentable weight. See Lowry, 32 F.3d at 1583-84, 32 USPQ2d at 1035. However, where the claim as a whole is directed to conveying a message or meaning to a human reader independent of the intended computer system, and/or the computer-readable medium merely serves as a support for information or data, no functional relationship exists. For example, a claim to a memory stick containing tables of batting averages, or tracks of recorded music, utilizes the intended computer system merely as a support for the information. Such claims are directed toward conveying meaning to the human reader rather than towards establishing a functional relationship between recorded data and the computer. See section 2111.05 of MPEP.

Claim(s) 20 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Comer (US 2005/0185937).
Regarding claim 20, Comer discloses a non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by a video processing apparatus, wherein the method comprises: determining, for a current video block of the video, to modify a video unit which is associated with a processing module; and generating the bitstream based on the determining (Examiner's note: the "non-transitory computer readable medium" does not establish a functional relationship between the recorded bitstream data and the computer readable medium, therefore the claim will be interpreted as a tangible device being able to store bitstream data; Comer, paragraph 29 discloses the base data bitstream can be recorded onto the DVD as a base layer and assigned a stream identification of 0xE0… the enhancement data bitstream can be recorded onto the DVD as an enhancement layer and assigned a stream identification of 0xBF, 0xFA, 0xFB, 0xFC, 0xFD or 0xFE).

Allowable Subject Matter

Claims 4-6 and 9-15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JERRY T JEAN BAPTISTE whose telephone number is (571)272-6189. The examiner can normally be reached Monday-Friday 9-5PM EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, William Vaughn can be reached at 571-272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JERRY T JEAN BAPTISTE/
Primary Examiner, Art Unit 2481
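Claim 8's two alternatives, modifying padding and existing samples together, or padding only after the existing samples are modified, can likewise be sketched with NumPy. The vertical flip and the `edge` padding mode are illustrative assumptions; the application does not specify which padding scheme the codec uses.

```python
import numpy as np

block = np.arange(9, dtype=np.float32).reshape(3, 3)  # toy existing samples

# Alternative 2 of claim 8: modify the existing samples first,
# then generate the padding samples around the modified unit.
modified = np.flipud(block)                # geometric modification (flip)
padded = np.pad(modified, 1, mode="edge")  # 1-sample border added afterwards

# Alternative 1: pad first, then modify padding and existing samples together.
padded_first = np.pad(block, 1, mode="edge")
modified_all = np.flipud(padded_first)
```

The two orderings generally produce different borders, which is why the claim distinguishes them: edge-replicated padding of a flipped block is not the same as flipping an edge-padded block unless the transform and padding commute.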

Prosecution Timeline

Jan 06, 2025: Application Filed
Feb 07, 2026: Non-Final Rejection, §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12599364: ULTRASOUND SIMULATION SYSTEM (2y 5m to grant; granted Apr 14, 2026)
Patent 12604002: METHOD AND ENCODER FOR ENCODING LIDAR DATA (2y 5m to grant; granted Apr 14, 2026)
Patent 12604011: VIDEO SIGNAL ENCODING/DECODING METHOD, AND RECORDING MEDIUM IN WHICH BITSTREAM IS STORED (2y 5m to grant; granted Apr 14, 2026)
Patent 12593054: LOW DELAY CONCEPT IN MULTI-LAYERED VIDEO CODING (2y 5m to grant; granted Mar 31, 2026)
Patent 12593072: METHOD FOR PICTURE OUTPUT WITH OUTPUT LAYER SET (2y 5m to grant; granted Mar 31, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
Grant Probability With Interview: 60% (-27.9%)
Median Time to Grant: 2y 2m
PTA Risk: Low
Based on 572 resolved cases by this examiner. Grant probability derived from career allow rate.
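The headline figures are simple ratios of the counts shown above. A minimal sketch, assuming the dashboard divides grants by resolved cases and subtracts the with-interview rate (the page's -27.9% lift differs slightly from what these rounded inputs yield, so its internal figures are presumably unrounded):

```python
# Counts come from the page; the formulas are assumptions about the tool.
granted = 500          # examiner's career grants
resolved = 572         # examiner's career resolved cases
with_interview = 0.60  # allow rate with an interview, per the page

allow_rate = granted / resolved             # career allow rate
interview_lift = with_interview - allow_rate

print(f"Career allow rate: {allow_rate:.1%}")      # prints: Career allow rate: 87.4%
print(f"Interview lift: {interview_lift:+.1%}")    # prints: Interview lift: -27.4%
```

With these rounded inputs the lift comes out at -27.4% rather than the displayed -27.9%, consistent with the page computing from more precise internal data.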
