DETAILED ACTION
The present Office action is in response to the application filed on May 3, 2024.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The Information Disclosure Statement (IDS) submitted on 05/03/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the Information Disclosure Statement is being considered by the Examiner.
Claim Objections
Claims 1, 14, and 15 are objected to because of the following informalities:
Each of claims 1, 14, and 15 recites “generating a set of second residuals based on a difference between the input video and a reconstructed video at the original spatial resolution,” which should read --generating a set of second residuals based on a difference between the input video at the original spatial resolution and a reconstructed video at the original spatial resolution--.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
With regard to claims 1, 14, and 15, the set of first residuals is generated from a difference between signals at two different resolutions, and it is unclear how such a difference is to be resolved. This appears to be a typographical error, as FIG. 1A illustrates a difference between signals at the same resolution. For examination purposes, the limitation will be interpreted as […] difference between the input video at the intermediate spatial resolution and a reconstructed video at the intermediate spatial resolution.
With regard to claim 3, the limitation appears to cause the first residuals to vanish by replacing the video, which would occur prior to the first residuals having been generated, and it is therefore unclear how there can be a vanishing step for residuals that have not yet been generated. For examination purposes, the limitation is interpreted as providing feedback to re-encode the first enhancement layer, with the difference of having replaced the video, to produce new first residuals that do not include the “vanished” residuals.
With regard to claim 6, an issue similar to that of claim 3 arises: the vanishing of the residuals occurs prior to generating the set of first residuals, and it is unclear how there can be a vanishing step for residuals that have not yet been generated. For examination purposes, the limitation is interpreted as comprising a plurality of steps, one of which is the claimed subtraction.
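For illustration only, the same-resolution requirement underlying the interpretation applied above can be sketched as follows. This is a hypothetical simplification, not the claimed method: frames are modeled as equal-length 1-D pixel lists, and the function name `residuals` is an illustrative label.

```python
def residuals(input_frame, reconstructed_frame):
    # A difference signal is only well defined when both operands share
    # the same spatial resolution (modeled here as equal-length lists).
    if len(input_frame) != len(reconstructed_frame):
        raise ValueError("frames must be at the same spatial resolution")
    # Element-wise difference: input minus reconstruction.
    return [a - b for a, b in zip(input_frame, reconstructed_frame)]
```

For example, `residuals([10, 20, 30], [8, 20, 27])` yields `[2, 0, 3]`, whereas operands at different resolutions admit no well-defined difference, consistent with the indefiniteness noted above.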
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 5-11, and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Publication No. 2016/0065976 A1 (hereinafter “He”) in view of U.S. Publication No. 2022/0191521 A1 (hereinafter “Ferrara”).
Regarding claim 1, He discloses a method of encoding an input video including a sequence of video frames as a hybrid video stream (FIG. 5 depicts encoder 500 with BL encoder 508 and EL encoder 516 for encoding the input video and then multiplexing the two output streams (e.g., hybrid video stream)), wherein the method comprises:
downsampling the input video from an original spatial resolution to a reduced spatial resolution (FIG. 1, down-sampling 101. [0039], “The spatial and/or temporal signal resolution to be represented by the layer 1 (e.g., a base layer) may be generated by down-sampling of the input video signal 120 at a down-sampler 101”) and an intermediate spatial resolution (FIG. 1, down-sampling 102. [0039], “the input video signal 120 may be downsampled at down-sampler 102.” Note, layer N is up-sampled to the resolution of layer 2 and layer 2 is up-sampled to the resolution of layer 1, meaning layer 2 is the intermediate spatial resolution);
providing the input video at the reduced spatial resolution to a base encoder to obtain a base encoded stream (FIG. 1, encoder layer 1 121 receives the reduced spatial resolution from down-sampling 101 and produces encoded base layer bitstream 122. FIG. 5, EL encoder 508 and base layer bitstream 510);
providing a first enhancement stream by:
generating a set of first residuals based on a difference between the input video [at the intermediate spatial resolution] and a reconstructed video at the intermediate spatial resolution (FIG. 1, sum 124. [0039], “the input video signal 120 may be downsampled at down-sampler 102 and then the upsampled base layer reconstruction signal may be subtracted from the downsampled input video signal at 124 to generate a difference signal.” Note, the difference signal is the set of first residuals);
quantizing the set of first residuals (FIG. 1, encoder layer 2 125 and Q2. [0039], “The difference signal may be encoded at a layer-2 encoder 125 to create a layer-2 bitstream 126.” [0055], “The encoder 300 may transform and quantize the prediction residual 318 at a transformation unit 304 and a quantization unit 306, respectively. By transforming a quantizing the prediction residual 318, the encoder 300 may generate a residual coefficient block 324. The residual coefficient block 324 may be referred to as a quantized residual.” Note, the encoder of FIG. 3 represents the encoding functionalities of each encoder in FIG. 1); and
forming the first enhancement stream from the set of quantized first residuals ([0039], “The difference signal may be encoded at a layer-2 encoder 125 to create a layer-2 bitstream 126.” [0055], “The encoder 300 may transform and quantize the prediction residual 318 at a transformation unit 304 and a quantization unit 306, respectively. By transforming a quantizing the prediction residual 318, the encoder 300 may generate a residual coefficient block 324. The residual coefficient block 324 may be referred to as a quantized residual.” FIG. 3, quantization 306 and bitstream 320. Note, the encoder of FIG. 3 represents the encoding functionalities of each encoder in FIG. 1);
providing a second enhancement stream by:
generating a set of second residuals based on a difference between the input video and a reconstructed video at the original spatial resolution (FIG. 1 illustrates encoder layer n receiving the original video signal 120 and generating residuals by summing the results from up-sampling layer 2);
quantizing the set of second residuals (FIG. 1, encoder layer N and QN. [0055], “The encoder 300 may transform and quantize the prediction residual 318 at a transformation unit 304 and a quantization unit 306, respectively. By transforming a quantizing the prediction residual 318, the encoder 300 may generate a residual coefficient block 324. The residual coefficient block 324 may be referred to as a quantized residual.” Note, the encoder of FIG. 3 represents the encoding functionalities of each encoder in FIG. 1, including encoder layer N); and
forming the second enhancement stream from the set of quantized second residuals (FIG. 1, encoded layer-N bitstream N. FIG. 3, quantization 306 and bitstream 320. Note, the encoder of FIG. 3 represents the encoding functionalities of each encoder in FIG. 1, including encoder layer N), wherein the second enhancement stream is at least partially encoded using temporal prediction and further comprises temporal signaling indicating whether temporal prediction is used (FIG. 3 depicts motion prediction (estimation and compensation) 322 (i.e., temporal prediction) and signaling prediction information 328. [0055], “The entropy coder 308 may generate an output video bitstream 320 using the residual coefficient block 324 and the coding mode, motion information, and/or prediction information 328.” FIG. 5, inter-layer prediction processing & management 514);
forming the hybrid video stream from the base encoded stream, the first enhancement stream and the second enhancement stream ([0061], “A bitstream multiplexer 528 may combine the base layer bitstream 510 and the enhancement layer bitstream 518 to generate a scalable bitstream 530.” FIG. 5 illustrates an example of two layers, which can be expanded to N layers, see FIG. 1).
He fails to expressly disclose characterized in that the method further comprises:
detecting at least one non-motion region in a video frame; and
causing the set of first residuals but not the set of second residuals to vanish throughout the non-motion region.
However, Ferrara teaches characterized in that the method further comprises:
detecting at least one non-motion region in a video frame ([0147], “the encoder may analyse the input video and prepare a set of residual masks for each frame of the video. For example, the residual masks may prioritise the area of the picture in which detail is required such as where the action is fast moving rather than the background of the sports field.” [0029], “wherein the configuration data comprises residual masks for one or more of the first and second encoders, wherein respective ones of the first and second encoders are configured to selectively apply the residual masks to respective ones of the first and second set of residuals prior to encoding such that a subset of non-zero values within are not present in respective first and second enhancement level streams”); and
causing the set of first residuals but not the set of second residuals to vanish throughout the non-motion region ([0029], “wherein the configuration data comprises residual masks for one or more of the first and second encoders, wherein respective ones of the first and second encoders are configured to selectively apply the residual masks to respective ones of the first and second set of residuals prior to encoding such that a subset of non-zero values within are not present in respective first and second enhancement level streams”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to have used a residual mask for eliminating residuals in one enhancement level stream, as taught by Ferrara ([0029]), in He’s invention. One would have been motivated to modify He’s invention, by including Ferrara’s invention, to reduce overall data size and maintaining overall quality as experienced by a user (Ferrara: [0004]).
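For illustration only, the residual-masking operation relied upon in the combination (Ferrara, [0029]) can be sketched as follows. The function and the mask representation are hypothetical simplifications: residuals and the non-motion mask are modeled as equal-length lists, and only the first residuals are masked.

```python
def apply_residual_mask(first_residuals, non_motion_mask):
    # Zero out (i.e., cause to "vanish") each first residual that falls
    # within the detected non-motion region; residuals outside the region
    # pass through unchanged. The set of second residuals is not masked.
    return [0 if in_non_motion else r
            for r, in_non_motion in zip(first_residuals, non_motion_mask)]
```

For example, applying the mask `[True, False, True]` to the residuals `[5, -3, 7]` yields `[0, -3, 0]`, removing the masked values from the first enhancement stream.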
Regarding claim 2, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein the set of first residuals is caused to vanish throughout the non-motion region by applying masking to the set of quantized first residuals ([0029], “wherein the configuration data comprises residual masks for one or more of the first and second encoders, wherein respective ones of the first and second encoders are configured to selectively apply the residual masks to respective ones of the first and second set of residuals prior to encoding such that a subset of non-zero values within are not present in respective first and second enhancement level streams.” [0121-0122] describes that the RM L-1 control 360-1 can analyze after quantization and adjust the residual control. [0075], “Although residual processing is shown prior to transformation, optionally, the processing step may be arranged elsewhere, for example, later in the encoding process.” Note, later in the encoding process includes after quantizing). The same motivation of claim 1 applies equally as well to claim 2.
Regarding claim 5, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein the set of first residuals is caused to vanish throughout the non-motion region by applying masking to the difference between the input video and a reconstructed video at the intermediate spatial resolution or by applying masking to the set of first residuals prior to the quantizing ([0029], “wherein the configuration data comprises residual masks for one or more of the first and second encoders, wherein respective ones of the first and second encoders are configured to selectively apply the residual masks to respective ones of the first and second set of residuals prior to encoding such that a subset of non-zero values within are not present in respective first and second enhancement level streams.” FIG. 3 discloses applying the residual processing 350-1 prior to transform, which is prior to quantizing). The same motivation of claim 1 applies equally as well to claim 5.
Regarding claim 6, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein the set of first residuals is caused to vanish throughout the non-motion region by: in the non-motion region of the video frame, subtracting from the input video, prior to generating the set of first residuals, a predicted difference between the input video and the reconstructed video at the intermediate spatial resolution (FIG. 3, RP 350-1 located prior to processing the residuals and after the differential of the base codec reconstruction and down-sampled input video. FIG. 7, 350-1 after 310-S and prior to processing the residuals). The same motivation of claim 1 applies equally as well to claim 6.
Regarding claim 7, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein each video frame of the first enhancement stream is decodable without reference to any other video frame of the first enhancement stream (FIG. 4 depicts enhancement level 1 decoded by decoder 400-1 without temporal predictive techniques referencing other frames in the same layer). The same motivation of claim 1 applies equally as well to claim 7.
Regarding claim 8, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, He discloses wherein providing the second enhancement stream further comprises determining, for each set of second residuals or quantized second residuals in a video frame, whether to use temporal prediction with reference to one or more other video frames, and indicating by the temporal signaling whether temporal prediction is used in said video frame (FIG. 3 depicts motion prediction (estimation and compensation) 322 (i.e., temporal prediction) and signaling prediction information 328. [0055], “The entropy coder 308 may generate an output video bitstream 320 using the residual coefficient block 324 and the coding mode, motion information, and/or prediction information 328.” FIG. 5, inter-layer prediction processing & management 514 outputs the temporal prediction information 526 to the mux 528 to be transmitted with the residuals).
Regarding claim 9, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein the at least one non-motion region is detected in a video frame of the input video at the original spatial resolution or in a video frame of the input video at the intermediate spatial resolution (FIG. 1 depicts the residual mode (RM) selection operation on the original spatial resolution and being applied at the intermediate spatial resolution). The same motivation of claim 1 applies equally as well to claim 9.
Regarding claim 10, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, He discloses wherein the intermediate spatial resolution is finer than the reduced spatial resolution, or the intermediate and reduced spatial resolutions are equal (FIG. 1 depicts the first enhancement layer having a higher spatial resolution (e.g., finer) than the base layer’s resolution. Note, Ferrara alternatively discloses the same spatial resolution for both in FIG. 1).
Regarding claim 11, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein the first and/or second residuals are generated by applying a transform kernel of size 2 x 2 or 4 x 4 pixels to the difference between the input video and the reconstructed video ([0066], “The transform as described herein may use a directional decomposition transform such as a Hadamard-based transform. Both may comprise a small kernel or matrix that is applied to flattened coding units of residuals (i.e. 2×2 or 4×4 blocks of residuals)”). The same motivation of claim 1 applies equally as well to claim 11.
Regarding claim 13, He and Ferrara disclose every limitation of claim 1, as outlined above. Additionally, Ferrara discloses wherein the set of first residuals and the set of second residuals are quantized using different levels of quantization (FIG. 3, first enhancement layer encoder 350-1 with quantization block 320-1 and second enhancement layer encoder 350-2 with quantization block 320-2. [0157], “The deadzone may be a function of a quantization step width for the residual (e.g. 5 times the step width). The step width may be a dynamic parameter that varies with residual location (e.g. with residual or group of residuals) or a static parameter for all residuals.” [0073], “This may be applied at both levels (1 and 2). For example, quantizing at block 320 may comprise dividing transformed residual values by a step-width. The step-width may be pre-determined, e.g. selected based on a desired level of quantization”). The same motivation of claim 1 applies equally as well to claim 13.
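For illustration only, the step-width quantization cited for claim 13 (Ferrara, [0073]) can be sketched as follows. The step-width values and the rounding convention are hypothetical; the point is that the two residual sets may be quantized with different step-widths.

```python
def quantize(residual_set, step_width):
    # Uniform quantization: divide each residual by the step-width and
    # truncate toward zero; a larger step-width yields coarser quantization.
    return [int(r / step_width) for r in residual_set]
```

For example, the same residuals quantized with different step-widths, e.g. `quantize(residual_set, 5)` for the first set and `quantize(residual_set, 2)` for the second, produce different levels of quantization.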
Regarding claim 14, the limitations are the same as those in claim 1, albeit written in machine form rather than process form. Therefore, the same rationale of claim 1 applies equally as well to claim 14. Additionally, He discloses a device comprising processing circuitry ([0151], “The processor 2018 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like”).
Regarding claim 15, the limitations are the same as those in claim 1, albeit written in product form rather than process form. Therefore, the same rationale of claim 1 applies equally as well to claim 15. Additionally, He discloses a non-transitory computer-readable storage medium having stored thereon a computer program comprising instructions which, when the program is executed by processing circuitry, cause the processing circuitry to carry out a method ([0180], “computer program, software, or firmware incorporated in a computer-readable medium for execution by a computer or processor”).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Publication No. 2016/0065976 A1 (hereinafter “He”) in view of U.S. Publication No. 2022/0191521 A1 (hereinafter “Ferrara”), and further in view of U.S. Publication No. 2022/0385911 A1 (hereinafter “Meardi”).
Regarding claim 12, He and Ferrara disclose every limitation of claim 11, as outlined above. He and Ferrara fail to expressly disclose wherein the transform kernel is a Low-Complexity Enhancement Video Coding, LCEVC, transform kernel.
However, Meardi teaches wherein the transform kernel is a Low-Complexity Enhancement Video Coding, LCEVC, transform kernel ([0024] describes the tier-based hierarchical format as “MPEG-5 Part 2 LCEVC (“Low Complexity Enhancement Video Coding”)”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to have complied with LCEVC, as taught by Meardi ([0024]), in He and Ferrara’s invention. One would have been motivated to modify He and Ferrara’s invention, by including Meardi’s invention, to comply with the latest compression scheme for higher coding efficiencies.
There is no prior-art rejection for claims 3 and 4; however, they are both rejected under 35 U.S.C. § 112(b), as outlined above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
U.S. Publication No. 2024/0155132 A1 – Discloses generating scalable layers with residual masking, see FIG. 7.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to STUART D BENNETT whose telephone number is (571)272-0677. The examiner can normally be reached Monday through Friday from 9:00 AM to 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn, can be reached at 571-272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/STUART D BENNETT/Examiner, Art Unit 2481