Last updated: May 29, 2026
Application No. 17/993,533
Image Encoding and Decoding Method and Apparatus

Non-Final OA §103
Filed
Nov 23, 2022
Priority
May 26, 2020 — continuation of PCTCN2020092408
Examiner
TRAN, DUY ANH
Art Unit
2674
Tech Center
2600 — Communications
Assignee
Huawei Technologies Co., Ltd.
OA Round
2 (Non-Final)
Interview Optional

Interview already conducted in this application's prosecution history. This examiner has an 80% grant rate with a +18.4% interview lift. Since an interview has already been tried, the recommendation is a written response with narrowed claims, based on precedent claim evolution patterns.
Based on 133 resolved cases, 2023–2026
Examiner Intelligence

TRAN, DUY ANH
Grants 80% — above average
Career Allowance Rate
107 granted / 133 resolved
+18.5% vs TC avg
Strong +18% interview lift
Interview Lift: +18.4% (resolved cases with interview vs. without)
Typical timeline
2y 10m
Avg Prosecution
21 currently pending
Career history
162
Total Applications
across all art units
Statute-Specific Performance

§101
1.0%
-39.0% vs TC avg
§103
81.7%
+41.7% vs TC avg
§102
12.2%
-27.8% vs TC avg
§112
3.4%
-36.6% vs TC avg
Black line = Tech Center average estimate • Based on career data from 133 resolved cases
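The dashboard figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes the career allowance rate is simply granted cases divided by resolved cases, and that each "vs TC avg" delta is the examiner's rate minus a Tech Center baseline. Neither definition is stated on the page, and all variable names are hypothetical.

```python
# Figures copied from the dashboard; the formulas relating them are assumptions.
granted, resolved = 107, 133
allowance_rate = 100 * granted / resolved  # percent

# statute -> (examiner rate %, displayed delta vs TC avg %)
statute_stats = {
    "101": (1.0, -39.0),
    "103": (81.7, 41.7),
    "102": (12.2, -27.8),
    "112": (3.4, -36.6),
}

# Assumed relation: delta = examiner rate - Tech Center baseline
implied_baselines = {
    statute: round(rate - delta, 1)
    for statute, (rate, delta) in statute_stats.items()
}

print(f"allowance rate: {allowance_rate:.1f}%")
print(implied_baselines)
```

Under these assumptions 107/133 comes to about 80.5% (displayed as 80%), and every statute implies the same 40% baseline, which fits the chart note that the Tech Center average is an estimate.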
Office Action

§103
DETAILED ACTION
This Action is in response to Applicant’s response filed on 08/19/2025. Claims 2-3 and 11-20 are canceled. Claims 1 and 4-10 and newly added claims 21-30 are pending in the present application. This Action is made FINAL.

Response to Arguments
Applicant's arguments filed on 08/19/2025 have been fully considered but are moot in view of the new ground(s) of rejection in view of Mukherjee et al (U.S. 20110268186 A1; Mukherjee).
Claim Status
Claim(s) 1, 4-10 and 21-30 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lu et al (U.S. 20130223524 A1; Lu), in view of  Mukherjee et al (U.S. 20110268186 A1; Mukherjee).


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 4-10 and 21-30 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lu et al (U.S. 20130223524 A1; Lu), in view of  Mukherjee et al (U.S. 20110268186 A1; Mukherjee).

Regarding claim 1, Lu discloses a method, (Paragraph 32: “The coding computer system (210) may use different techniques and/or different types of inserted synchronization predicted video frames (270) to allow for synchronization.”)  comprising: 
obtaining a to-be-encoded image (Frames), wherein the to-be-encoded image is divided into a base layer (Figs. 3-5: base layer 302, 402, 502) and at least one enhancement layer (Figs. 3-5: enhancement layer 304, 404, 504); (Paragraph 33: “Referring now to FIG. 3, an example of a regular prediction structure (300) with periodic key frames is illustrated. Each frame can include a base layer (302) and an enhancement layer (304) (or multiple enhancement layers, not shown),”), and 
determining, when receiving feedback information from a decoder side, (Paragraphs 32-34: “For scalable coded video such as H.264 SVC, performance may be improved by analyzing the location of the loss (such as by receiving a notice of data loss and a location (e.g., which frame and/or which layer) of the data loss) and inserting appropriate synchronization information based on the inter-layer dependency and predictive coding structure … the regular prediction structure (300) can start with an instantaneous decoding refresh (IDR) type key frame (310) (frame 0), which is an intra-coded key frame.”) a reconstructed image corresponding to a first frame sequence number and a first layer sequence number indicated in the feedback information as a first reference frame, (Figs. 3-5 and Paragraph 37: “In frame 6, the lost data (360) is in the enhancement layer (304), but the base layer (302) was not lost … , the coding computer system can receive a notice of the lost data and can code and insert a predicted key frame (320) as frame 7. That predicted key frame (320) can include a prediction that references the previous key frame (frame 5), … the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7.”)
performing inter encoding on the base layer based on the first reference frame to obtain a first bitstream of the base layer, (Figs. 3-6: the bitstream of base layer 302,402,502 and  Paragraph 35: “The IDR-type key frame (310) can be followed by regular predicted frames (330) (frames 1, 2, 3, 4, 6, 7, 8, and 9). The regular predicted frames (330) can each include a base layer (302) and an enhancement layer (304). Each base layer (302) of a regular predicted frame (330) can be coded with a prediction that references a highest enhancement layer of the previous frame. Each enhancement layer (304) of a regular predicted frame (330) can be coded with a prediction that also references the highest enhancement layer of the previous frame, and that references the base layer (302) and/or one or more lower enhancement layers of that same frame.”) wherein the first bitstream carries coding reference information, and wherein the coding reference information comprises a second frame sequence number of the first reference frame and a second layer sequence number of the first reference frame; (Figs. 3-5 ; Paragraphs 33-38 ; Paragraph 46: “The encoding computer system can receive (620) a notification of lost data in the bitstream. The lost data can include at least a portion of a reference frame of the bitstream. In response to the notification, the encoding computer system can dynamically encode (630) a synchronization predicted frame with a prediction that references one or more other previously-sent frames in the bitstream without referencing the lost data.”) 
encoding the at least one enhancement layer to obtain a second bitstream of the at least one enhancement layer; (Figs. 3-6: the bitstream of enhancement layer 304,404,504 and Paragraph 38: “the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7.” ; Paragraph 46: “The encoding computer system can receive (620) a notification of lost data in the bitstream. The lost data can include at least a portion of a reference frame of the bitstream. In response to the notification, the encoding computer system can dynamically encode (630) a synchronization predicted frame with a prediction that references one or more other previously-sent frames in the bitstream without referencing the lost data.”)  and 
sending the first bitstream and the second bitstream to the decoder side. (Figs. 6-8; Paragraph 46, Paragraph 52-53: “The technique can include encoding (705) and sending (710) a video bitstream over a computer network to a decoding computer system. … the regular prediction structure can be dynamically modified (730) by encoding and inserting in the bitstream a synchronization predicted frame having a prediction that does not reference the lost data. …Inserting the synchronization predicted frame can include inserting the synchronization predicted frame in the bitstream in a position … the enhancement layer of the reference frame may be a quality enhancement layer or a spatial enhancement layer. The lost data may include at least a portion of the enhancement layer, and the prediction of the synchronization predicted frame may reference a base layer below the enhancement layer without referencing the enhancement layer.”)

However, Lu does not disclose wherein the to-be-encoded image is a sub-image in an entire image frame;  wherein the feedback information comprises location information indicating a location of the sub-image in the entire image frame; 

Mukherjee discloses obtaining a to-be-encoded image (Fig. 1: source video 105; decimator 115; Paragraph 14: “Source video 105 includes a plurality of frames 200 (depicted in FIG. 2). Frames 200 can include any number of frames. In one embodiment, frames 200 can include frames N-x through N+y.”), wherein the to-be-encoded image is divided into a base layer (Fig. 2: base layer bit stream 123) and at least one enhancement layer (Fig. 2: enhancement intra (EI) encoder 130 and enhancement inter (EP) encoder 140), and wherein the to-be-encoded image is a sub-image in an entire image frame (Paragraph 16: “Decimator 115 is configured to receive and downsample source video 105. Decimator 115 downsamples on a frame by frame basis. In one embodiment, decimator 115 downsamples by a factor of 2. … Encoder 120 is configured to encode downscaled frames 200 of source video 105.”; that is, the downsampled source video is read as the claimed “sub-image”);
determining, when receiving feedback information from a decoder side (Fig. 1: feedback 180), a reconstructed image corresponding to a first frame sequence number and a first layer sequence number indicated in the feedback information (Paragraphs 28-29: “decoder 104 maintains two lists of previously reconstructed full-resolution frames. List A 167 is the list of correctly received and decoded enhancement frames (e.g., reconstructed frame(s) 161) using the enhancement information transmitted. List B is a list of frames (e.g., SSR frame(s) 159) that have been reconstructed using a suitable semi-super-resolution operation using only the base layer and previous List A or List B frames”), wherein the feedback information comprises location information indicating a location of the sub-image in the entire image frame (Paragraph 22: “Encoder 102 (in particular, EP encoder 140) uses feedback 180 received from decoder 104 to determine if and when to encode residue 127 (e.g., enhancement layer Laplacian residue) at an appropriate rate. … , if feedback 180 is received, then feedback 180 is used to decide a coding strategy for each region (for example, a block) of a current frame.”; Paragraph 29: “feedback 180 indicates how "good" the match is (e.g., on scale of 0-1) on a region-by-region basis.”; that is, “a region” is interpreted as the claimed “location”);
performing inter encoding on the base layer to obtain a first bitstream of the base layer (Fig. 1: base layer bitstream 123; Paragraph 18: “encoder 120 encodes and transmits a base layer bitstream 123. … One example for real-time communication is by use of Reference Picture Selection, where every frame of the base layer bitstream 123 is encoded only based on previously acknowledged frames.”);
encoding the at least one enhancement layer to obtain a second bitstream of the at least one enhancement layer (Fig. 1: EI bitstream 133; EP bitstream 143; Paragraphs 20-21: “EI encoder 130 is configured to encode residue 127 in buffer 135 and subsequently transmit EI bitstream 133 (e.g., EI pictures) to decoder 104.  … EP encoder 140 is configured to encode residue 127 and subsequently transmit EP bitstream 143 (e.g., EP pictures) to decoder 104.”); and
sending the first bitstream (Fig. 1: base layer bitstream 123) and the second bitstream (Fig. 1: EI bitstream 133; EP bitstream 143) to the decoder side (Fig. 1: decoder 104; Paragraphs 20-21: “EI encoder 130 is configured to encode residue 127 in buffer 135 and subsequently transmit EI bitstream 133 (e.g., EI pictures) to decoder 104.  … EP encoder 140 is configured to encode residue 127 and subsequently transmit EP bitstream 143 (e.g., EP pictures) to decoder 104.”; Paragraph 26: “Base layer decoder 150 is configured to receive and decode base layer bitstream 123.”).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lu by including the encoding of downscaled frames and the decoder-computed statistics and feedback information taught by Mukherjee, to arrive at an encoding/decoding system using feedback. One of ordinary skill in the art would have been motivated to combine the references because doing so improves a reasonable lower bound on the quality of video received, and the feedback information provides valuable clues to the encoder in deciding how to code the enhancement residual information so as to obtain a compact bitstream. (Mukherjee: Paragraph 56)
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
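For readers weighing the claim 1 mapping, the feedback-driven reference selection recited in the claim can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation and not code from Lu or Mukherjee. The names `Feedback`, `BaseLayerBitstream`, `ReconBuffer`, and `encode_base_layer` are hypothetical, and the XOR "residual" is only a toy stand-in for real inter prediction.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Location = Tuple[int, int]  # assumed (x, y) of the sub-image in the full frame


@dataclass(frozen=True)
class Feedback:
    frame_seq: int      # "first frame sequence number"
    layer_seq: int      # "first layer sequence number" (0 assumed to be the base layer)
    location: Location  # location of the sub-image in the entire image frame


@dataclass
class BaseLayerBitstream:
    payload: bytes
    ref_frame_seq: int  # "second frame sequence number" carried in the bitstream
    ref_layer_seq: int  # "second layer sequence number" carried in the bitstream


# Reconstructed images buffered at the encoder, keyed by (location, frame, layer).
ReconBuffer = Dict[Tuple[Location, int, int], bytes]


def encode_base_layer(sub_image: bytes, fb: Feedback, recon: ReconBuffer) -> BaseLayerBitstream:
    # Use the reconstructed image named by the feedback as the reference frame.
    reference = recon[(fb.location, fb.frame_seq, fb.layer_seq)]
    # Toy residual: XOR against the reference (a placeholder, not a real codec).
    residual = bytes(a ^ b for a, b in zip(sub_image, reference))
    # Signal which frame and layer served as the reference.
    return BaseLayerBitstream(residual, fb.frame_seq, fb.layer_seq)
```

The point of the sketch is the data flow: the feedback names a frame, a layer, and a sub-image location, the encoder looks up that reconstruction, and the resulting base-layer bitstream carries the reference identifiers back to the decoder.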

Regarding claim 4, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses wherein the first frame sequence number indicates a preceding nth image frame of the to-be-encoded image, wherein n is a positive integer, and wherein the first layer sequence number corresponds to: an image layer that has highest quality or resolution and that is successfully decoded by the decoder side from a bitstream of the preceding nth image frame of the to-be-encoded image; an image layer that has highest quality or resolution and that is successfully received by the decoder side from a bitstream of the preceding nth image frame of the to-be-encoded image; or an image layer that is determined by the decoder side to have highest quality or resolution and that is to be decoded from a bitstream of the preceding nth image frame of the to-be-encoded image. (Paragraphs 33-35: “an enhancement layer may add additional quality features (e.g., smaller quantization step size) and/or higher spatial resolution. … The IDR-type key frame (310) can be followed by regular predicted frames (330) (frames 1, 2, 3, 4, 6, 7, 8, and 9). The regular predicted frames (330) can each include a base layer (302) and an enhancement layer (304). Each base layer (302) of a regular predicted frame (330) can be coded with a prediction that references a highest enhancement layer of the previous frame. Each enhancement layer (304) of a regular predicted frame (330) can be coded with a prediction that also references the highest enhancement layer of the previous frame, … The regular prediction structure (300) of FIG. 3 can include periodic insertion of predicted key frames (320) (frames 5 and 10). Each predicted key frame (320) can include a prediction that references other key frames (320 and/or 310), but does not reference regular predicted frames.”)
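The first alternative in claim 4 (the highest layer successfully decoded from an earlier frame) can be illustrated with a short sketch. It assumes a strict dependency chain in which each enhancement layer needs every lower layer of the same frame. That is a simplification of the layered structure discussed above, and `highest_decodable_layer` is a hypothetical helper name.

```python
from typing import Iterable, Optional


def highest_decodable_layer(received_layers: Iterable[int]) -> Optional[int]:
    """Return the highest layer index usable as a reference, or None.

    Assumption: layer 0 is the base layer, and layer k can only be decoded
    if layers 0..k-1 of the same frame were also received.
    """
    received = set(received_layers)
    layer = -1
    # Walk upward from the base layer until the first missing layer.
    while layer + 1 in received:
        layer += 1
    return layer if layer >= 0 else None


print(highest_decodable_layer([0, 1, 2]))  # all layers arrived
print(highest_decodable_layer([0, 2]))     # layer 1 lost, so layer 2 is unusable
print(highest_decodable_layer([1, 2]))     # base layer lost
```

Under this assumption a lost base layer leaves nothing decodable for that frame, while a lost enhancement layer still leaves the layers below it available as a reference.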

Regarding claim 5, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses wherein after obtaining the to-be-encoded image, the method further comprises performing, when the feedback information is not received or the feedback information comprises identification information indicating a receiving failure or a decoding failure, inter encoding on the base layer based on a third reference frame, wherein the third reference frame is for a base layer of a previous image frame of the to-be-encoded image. (Figs. 3-5 and Paragraphs 37-38: “the coding computer system can receive a notice of the lost data and can code and insert a predicted key frame (320) as frame 7. That predicted key frame (320) can include a prediction that references the previous key frame (frame 5), … the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7. Because the predicted key frames (320) use some inter-frame prediction, they may be coded more efficiently than would intra-coded frames, such as IDR-type key frames, and they can still allow for synchronization and cutting off of drift from lost data.”)

Regarding claim 6, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses after obtaining the to-be-encoded image, the method further comprises performing, when the feedback information is not received or the feedback information comprises identification information indicating a receiving failure or a decoding failure, intra encoding on the base layer. (Figs. 3-5 and Paragraphs 42-43: “In response to receiving a notice of the lost data (560) in the enhancement layer (504) of frame 6, the coding computer system can code and insert an anchor predicted frame (520) as frame 7. The anchor predicted frame can include a base layer (502) with a prediction that references the base layer (502) of frame 6. The base layer (502) of the anchor predicted frame (520) can be coded as in the regular predicted frames (520), while the enhancement layer (504) of frame 7 can include a prediction that is limited to only including intra-frame references. … In response to receiving a notification of lost data (560) in the base layer (502) of frame 9, the coding computer system can code and insert an IDR-type key frame (510) as frame 10 to cut off drift from the lost data (560) in frame 9”)
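The fallback behavior recited in claims 5 and 6 (what the encoder does when feedback is missing or reports a failure) can be sketched as a small decision function. This is a hedged illustration only: the `Mode` enum, the dictionary keys, the `prefer_intra` switch, and the rule that the very first frame is intra coded are all assumptions made for the example, not features taken from the application or the cited references.

```python
from enum import Enum
from typing import Optional, Tuple


class Mode(Enum):
    INTER_FEEDBACK_REF = "inter, reference named in feedback"  # claim 1 path
    INTER_PREV_BASE = "inter, base layer of previous frame"    # claim 5 path
    INTRA = "intra"                                            # claim 6 path


def choose_base_layer_mode(
    feedback: Optional[dict],
    current_frame: int,
    prefer_intra: bool = False,
) -> Tuple[Mode, Optional[Tuple[int, int]]]:
    """Return the coding mode and the (frame_seq, layer_seq) reference, if any."""
    # Usable feedback: received and not flagged as a receive/decode failure.
    if feedback is not None and not feedback.get("failure", False):
        return Mode.INTER_FEEDBACK_REF, (feedback["frame_seq"], feedback["layer_seq"])
    # No usable feedback: intra code if requested or if no previous frame exists.
    if prefer_intra or current_frame == 0:
        return Mode.INTRA, None
    # Otherwise reference the base layer (layer 0) of the previous frame.
    return Mode.INTER_PREV_BASE, (current_frame - 1, 0)
```

Claims 5 and 6 share the same trigger condition and differ only in the fallback chosen, which is why the sketch exposes that choice as a single flag.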

Regarding claim 7, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses encoding the at least one enhancement layer comprises performing inter encoding on a first enhancement layer based on a second reference frame to obtain a third bitstream of the first enhancement layer, wherein the first enhancement layer is any one of the at least one enhancement layer, wherein the second reference frame is a second reconstructed image corresponding to a first image layer, (Figs. 3-5 and Paragraphs 37-38: “the coding computer system can receive a notice of the lost data and can code and insert a predicted key frame (320) as frame 7. That predicted key frame (320) can include a prediction that references the previous key frame (frame 5), … in frame 9, there is lost data (360) in the base layer (302) of frame 9. Because data in the base layer (302) is lost, and the enhancement layer (304) of frame 9 includes prediction that references the base layer (302), the enhancement layer (304) cannot be correctly decoded … the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7. Because the predicted key frames (320) use some inter-frame prediction, they may be coded more efficiently than would intra-coded frames, such as IDR-type key frames, and they can still allow for synchronization and cutting off of drift from lost data.”) and wherein a quality or a resolution of the first image layer is lower than a quality or a resolution of the first enhancement layer. (Figs. 3-5 and Paragraph 33: “an enhancement layer may add additional quality features (e.g., smaller quantization step size) and/or higher spatial resolution.”)

Regarding claim 8, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses the first image layer is lower than the first enhancement layer or is the base layer. (Paragraph 35: “Each enhancement layer (304) of a regular predicted frame (330) can be coded with a prediction that also references the highest enhancement layer of the previous frame, and that references the base layer (302) and/or one or more lower enhancement layers of that same frame.”; Figs. 3-5 show a base layer and an enhancement layer)

Regarding claim 9, Lu, as modified by Mukherjee, discloses all of the claimed invention. Mukherjee further discloses during encoding the at least one enhancement layer, the method further comprises buffering reconstructed images respectively corresponding to the base layer and the at least one enhancement layer. (Paragraph 28: “List A 167 is the list of correctly received and decoded enhancement frames (e.g., reconstructed frame(s) 161) using the enhancement information transmitted. List B is a list of frames (e.g., SSR frame(s) 159) that have been reconstructed using a suitable semi-super-resolution operation using only the base layer and previous List A or List B frames.”; Paragraph 45: “in response to feedback from a receiver, wherein the feedback is based on the base layer of the current frame and previous SRR versions of base layer frames (e.g., List B 157) and correctly received enhancement frames (list A, 167): coding strategy for each block of the current frame is determined; and one or more of source coding and Wyner-Ziv coding is utilized to code.”)
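The buffering step recited in claim 9 amounts to keeping per-layer reconstructions available so that a later feedback message can name one of them as a reference. A minimal sketch follows, assuming a fixed capacity counted in frames and oldest-first eviction. Both are illustrative choices, and the `ReconstructionBuffer` class is hypothetical.

```python
from collections import OrderedDict
from typing import Dict, Optional


class ReconstructionBuffer:
    """Keeps reconstructed images per (frame, layer) for the most recent frames."""

    def __init__(self, capacity: int = 4) -> None:
        self.capacity = capacity  # assumed limit on buffered frames
        self._frames: "OrderedDict[int, Dict[int, bytes]]" = OrderedDict()

    def store(self, frame_seq: int, layer_seq: int, image: bytes) -> None:
        # Group reconstructions by frame; layer 0 is assumed to be the base layer.
        self._frames.setdefault(frame_seq, {})[layer_seq] = image
        # Evict the oldest frames once the capacity is exceeded.
        while len(self._frames) > self.capacity:
            self._frames.popitem(last=False)

    def lookup(self, frame_seq: int, layer_seq: int) -> Optional[bytes]:
        # Returns None if the frame was evicted or the layer was never stored.
        return self._frames.get(frame_seq, {}).get(layer_seq)
```

A `lookup` miss is the case in which the encoder could not honor a feedback-named reference and would need some fallback.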

Regarding claim 10, Lu, as modified by Mukherjee, discloses all of the claimed invention. Mukherjee further discloses before determining the reconstructed image the method further comprises: monitoring the feedback information within a specified duration; and determining, when the feedback information is received within the specified duration, that the feedback information is received. (Paragraphs 38-39: “In response to either enhancement information either received or not received with an allotted time, as described above, Lists A and B are updated so as to keep only the latest certain number of frames in each list. … In a video conferencing application only a few hundred milliseconds is acceptable, so if the RTT and encoding/decoding time is less than this, then the A list would be output; otherwise the B list must be used as output.”; Paragraph 45: “in response to feedback from a receiver, wherein the feedback is based on the base layer of the current frame and previous SRR versions of base layer frames (e.g., List B 157) and correctly received enhancement frames (list A, 167): coding strategy for each block of the current frame is determined; and one or more of source coding and Wyner-Ziv coding is utilized to code.”)
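The timed monitoring recited in claim 10 can be pictured as a bounded wait on a feedback channel. The sketch below uses Python's standard `queue.Queue` as a stand-in for that channel. The function name `wait_for_feedback` and the dictionary payload are assumptions for illustration, and a real system would receive feedback over a network transport rather than an in-process queue.

```python
import queue
from typing import Optional


def wait_for_feedback(channel: "queue.Queue[dict]", timeout_s: float) -> Optional[dict]:
    """Wait up to timeout_s seconds for feedback; None means it is treated as not received."""
    try:
        # Blocks until an item arrives or the specified duration elapses.
        return channel.get(timeout=timeout_s)
    except queue.Empty:
        return None


channel: "queue.Queue[dict]" = queue.Queue()
print(wait_for_feedback(channel, 0.01))   # nothing arrived within the window
channel.put({"frame_seq": 7, "layer_seq": 1})
print(wait_for_feedback(channel, 0.01))   # feedback arrived in time
```

Feedback arriving after the window closes would simply remain queued in this sketch. How late feedback is handled is left open here.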

Regarding claim 21, Lu discloses a source device encoder (Paragraph 18: “one or more such computing environments can be used as a coding computer environment, a decoding computer environment, and/or a server that facilitates transmission of a video bitstream between the coding computer system and one or more decoding computer systems.”) comprising: a memory configured to store instructions; and one or more processors coupled to the memory and configured to execute the instructions to cause the source device encoder (Paragraph 20: “the computing environment (100) includes at least one processing unit or processor (110) and memory (120). … The processing unit (110) executes computer-executable instructions and may be a real or a virtual processor. … The memory (120) stores software (180) implementing dynamic insertion of synchronization predicted video frames.”) to: 
obtain a to-be-encoded image (Frames), wherein the to-be-encoded image is divided into a base layer (Figs. 3-5: base layer 302, 402, 502) and at least one enhancement layer (Figs. 3-5: enhancement layer 304, 404, 504); (Paragraph 33: “Referring now to FIG. 3, an example of a regular prediction structure (300) with periodic key frames is illustrated. Each frame can include a base layer (302) and an enhancement layer (304) (or multiple enhancement layers, not shown),”), and 
determine, when receiving feedback information from a decoder side, (Paragraphs 32-34: “For scalable coded video such as H.264 SVC, performance may be improved by analyzing the location of the loss (such as by receiving a notice of data loss and a location (e.g., which frame and/or which layer) of the data loss) and inserting appropriate synchronization information based on the inter-layer dependency and predictive coding structure … the regular prediction structure (300) can start with an instantaneous decoding refresh (IDR) type key frame (310) (frame 0), which is an intra-coded key frame.”) a reconstructed image corresponding to a first frame sequence number and a first layer sequence number indicated in the feedback information as a first reference frame, (Figs. 3-5 and Paragraph 37: “In frame 6, the lost data (360) is in the enhancement layer (304), but the base layer (302) was not lost … , the coding computer system can receive a notice of the lost data and can code and insert a predicted key frame (320) as frame 7. That predicted key frame (320) can include a prediction that references the previous key frame (frame 5), … the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7.”)
perform inter encoding on the base layer based on the first reference frame to obtain a first bitstream of the base layer, (Figs. 3-6: the bitstream of base layer 302,402,502 and  Paragraph 35: “The IDR-type key frame (310) can be followed by regular predicted frames (330) (frames 1, 2, 3, 4, 6, 7, 8, and 9). The regular predicted frames (330) can each include a base layer (302) and an enhancement layer (304). Each base layer (302) of a regular predicted frame (330) can be coded with a prediction that references a highest enhancement layer of the previous frame. Each enhancement layer (304) of a regular predicted frame (330) can be coded with a prediction that also references the highest enhancement layer of the previous frame, and that references the base layer (302) and/or one or more lower enhancement layers of that same frame.”) wherein the first bitstream carries coding reference information, and wherein the coding reference information comprises a second frame sequence number of the first reference frame and a second layer sequence number of the first reference frame; (Figs. 3-5 ; Paragraphs 33-38 ; Paragraph 46: “The encoding computer system can receive (620) a notification of lost data in the bitstream. The lost data can include at least a portion of a reference frame of the bitstream. In response to the notification, the encoding computer system can dynamically encode (630) a synchronization predicted frame with a prediction that references one or more other previously-sent frames in the bitstream without referencing the lost data.”) 
encode the at least one enhancement layer to obtain a second bitstream of the at least one enhancement layer; (Figs. 3-6: the bitstream of enhancement layer 304,404,504 and Paragraph 38: “the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7.” ; Paragraph 46: “The encoding computer system can receive (620) a notification of lost data in the bitstream. The lost data can include at least a portion of a reference frame of the bitstream. In response to the notification, the encoding computer system can dynamically encode (630) a synchronization predicted frame with a prediction that references one or more other previously-sent frames in the bitstream without referencing the lost data.”)  and 
send the first bitstream and the second bitstream to the decoder side. (Figs. 6-8; Paragraph 46, Paragraph 52-53: “The technique can include encoding (705) and sending (710) a video bitstream over a computer network to a decoding computer system. … the regular prediction structure can be dynamically modified (730) by encoding and inserting in the bitstream a synchronization predicted frame having a prediction that does not reference the lost data. …Inserting the synchronization predicted frame can include inserting the synchronization predicted frame in the bitstream in a position … the enhancement layer of the reference frame may be a quality enhancement layer or a spatial enhancement layer. The lost data may include at least a portion of the enhancement layer, and the prediction of the synchronization predicted frame may reference a base layer below the enhancement layer without referencing the enhancement layer.”)

However, Lu does not disclose wherein the to-be-encoded image is a sub-image in an entire image frame;  wherein the feedback information comprises location information indicating a location of the sub-image in the entire image frame; 

Mukherjee discloses: obtain a to-be-encoded image (Fig. 1: source video 105; decimator 115; Paragraph 14: “Source video 105 includes a plurality of frames 200 (depicted in FIG. 2). Frames 200 can include any number of frames. In one embodiment, frames 200 can include frames N-x through N+y.”), wherein the to-be-encoded image is divided into a base layer (Fig. 2: base layer bit stream 123) and at least one enhancement layer (Fig. 2: enhancement intra (EI) encoder 130 and enhancement inter (EP) encoder 140), and wherein the to-be-encoded image is a sub-image in an entire image frame (Paragraph 16: “Decimator 115 is configured to receive and downsample source video 105. Decimator 115 downsamples on a frame by frame basis. In one embodiment, decimator 115 downsamples by a factor of 2. … Encoder 120 is configured to encode downscaled frames 200 of source video 105.”; that is, the downsampled source video is read as the claimed “sub-image”);
determine, when receiving feedback information from a decoder side (Fig. 1: feedback 180), a reconstructed image corresponding to a first frame sequence number and a first layer sequence number indicated in the feedback information (Paragraphs 28-29: “decoder 104 maintains two lists of previously reconstructed full-resolution frames. List A 167 is the list of correctly received and decoded enhancement frames (e.g., reconstructed frame(s) 161) using the enhancement information transmitted. List B is a list of frames (e.g., SSR frame(s) 159) that have been reconstructed using a suitable semi-super-resolution operation using only the base layer and previous List A or List B frames”), wherein the feedback information comprises location information indicating a location of the sub-image in the entire image frame (Paragraph 22: “Encoder 102 (in particular, EP encoder 140) uses feedback 180 received from decoder 104 to determine if and when to encode residue 127 (e.g., enhancement layer Laplacian residue) at an appropriate rate. … , if feedback 180 is received, then feedback 180 is used to decide a coding strategy for each region (for example, a block) of a current frame.”; Paragraph 29: “feedback 180 indicates how "good" the match is (e.g., on scale of 0-1) on a region-by-region basis.”; that is, “a region” is interpreted as the claimed “location”);
perform inter encoding on the base layer to obtain a first bitstream of the base layer (Fig. 1: base layer bitstream 123; Paragraph 18: “encoder 120 encodes and transmits a base layer bitstream 123. … One example for real-time communication is by use of Reference Picture Selection, where every frame of the base layer bitstream 123 is encoded only based on previously acknowledged frames.”);
encode the at least one enhancement layer to obtain a second bitstream of the at least one enhancement layer (Fig. 1: EI bitstream 133; EP bitstream 143; Paragraphs 20-21: “EI encoder 130 is configured to encode residue 127 in buffer 135 and subsequently transmit EI bitstream 133 (e.g., EI pictures) to decoder 104.  … EP encoder 140 is configured to encode residue 127 and subsequently transmit EP bitstream 143 (e.g., EP pictures) to decoder 104.”); and
send the first bitstream (Fig. 1: base layer bitstream 123) and the second bitstream (Fig. 1: EI bitstream 133; EP bitstream 143) to the decoder side (Fig. 1: decoder 104; Paragraphs 20-21: “EI encoder 130 is configured to encode residue 127 in buffer 135 and subsequently transmit EI bitstream 133 (e.g., EI pictures) to decoder 104.  … EP encoder 140 is configured to encode residue 127 and subsequently transmit EP bitstream 143 (e.g., EP pictures) to decoder 104.”; Paragraph 26: “Base layer decoder 150 is configured to receive and decode base layer bitstream 123.”).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lu by including the encoding of downscaled frames and the decoder-computed statistics and feedback information taught by Mukherjee, to arrive at an encoding/decoding system that uses feedback. One of ordinary skill in the art would have been motivated to combine the references because doing so would improve a reasonable lower bound on the quality of the received video, and because the feedback information provides valuable clues to the encoder for deciding how to code the enhancement residual information so as to obtain a compact bitstream. (Mukherjee: Paragraph 56)
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
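As a reading aid for the limitations mapped above, the following minimal Python sketch shows one way an encoder could resolve a feedback message to a buffered reconstructed image. It is an illustration under stated assumptions, not the claimed implementation and not code from Lu or Mukherjee. The names Feedback, ReconKey, and select_reference, and the (row, col) form of the sub-image location, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical key: (sub-image location, frame sequence number, layer sequence number).
ReconKey = Tuple[Tuple[int, int], int, int]


@dataclass(frozen=True)
class Feedback:
    """Hypothetical feedback message sent by the decoder side."""
    frame_seq: int              # first frame sequence number
    layer_seq: int              # first layer sequence number (0 = base layer, by assumption)
    location: Tuple[int, int]   # assumed (row, col) of the sub-image in the entire frame


def select_reference(recon: Dict[ReconKey, bytes],
                     fb: Optional[Feedback]) -> Optional[bytes]:
    """Return the reconstructed image named by the feedback, or None.

    None means no usable reference was identified, so the caller has to
    fall back to some other coding mode.
    """
    if fb is None:
        return None
    return recon.get((fb.location, fb.frame_seq, fb.layer_seq))


# Usage: the encoder buffered layer 2 of frame 7 for the sub-image at (0, 1).
recon_store: Dict[ReconKey, bytes] = {((0, 1), 7, 2): b"recon-f7-l2"}
ref = select_reference(recon_store, Feedback(frame_seq=7, layer_seq=2, location=(0, 1)))
```

Keying the store by location as well as by frame and layer number reflects the claim language that the feedback identifies the sub-image; a real encoder would hold picture buffers rather than byte strings.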
Regarding claim 22, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses wherein the first frame sequence number indicates a preceding nth image frame of the to-be-encoded image, wherein n is a positive integer, and wherein the first layer sequence number corresponds to: an image layer that has highest quality or resolution and that is successfully decoded by the decoder side from a bitstream of the preceding nth image frame of the to-be-encoded image; an image layer that has highest quality or resolution and that is successfully received by the decoder side from a bitstream of the preceding nth image frame of the to-be-encoded image; or an image layer that is determined by the decoder side to have highest quality or resolution and that is to be decoded from a bitstream of the preceding nth image frame of the to-be-encoded image. (Paragraphs 33-35: “an enhancement layer may add additional quality features (e.g., smaller quantization step size) and/or higher spatial resolution. … The IDR-type key frame (310) can be followed by regular predicted frames (330) (frames 1, 2, 3, 4, 6, 7, 8, and 9). The regular predicted frames (330) can each include a base layer (302) and an enhancement layer (304). Each base layer (302) of a regular predicted frame (330) can be coded with a prediction that references a highest enhancement layer of the previous frame. Each enhancement layer (304) of a regular predicted frame (330) can be coded with a prediction that also references the highest enhancement layer of the previous frame, … The regular prediction structure (300) of FIG. 3 can include periodic insertion of predicted key frames (320) (frames 5 and 10). Each predicted key frame (320) can include a prediction that references other key frames (320 and/or 310), but does not reference regular predicted frames.”)
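To make the first alternative of claim 22 concrete, the sketch below computes, on the decoder side, the highest layer that decoded successfully for the preceding nth frame. It is only an illustration: it assumes each layer depends on every lower layer of the same frame, and the function names and the per-frame status table are hypothetical rather than taken from the claims or the cited references.

```python
from typing import Dict, List, Optional, Tuple


def highest_decoded_layer(layer_ok: List[bool]) -> Optional[int]:
    """Index of the highest layer that decoded successfully.

    Assumes layer k can only be used if layers 0..k-1 also decoded,
    so the scan stops at the first failure.
    """
    highest: Optional[int] = None
    for idx, ok in enumerate(layer_ok):
        if not ok:
            break
        highest = idx
    return highest


def build_feedback(current_frame: int, n: int,
                   status: Dict[int, List[bool]]) -> Optional[Tuple[int, int]]:
    """Return (frame sequence number, layer sequence number) for frame current_frame - n."""
    target = current_frame - n
    layers = status.get(target)
    if layers is None:
        return None
    top = highest_decoded_layer(layers)
    return None if top is None else (target, top)


# Usage: frame 8 decoded its base layer and first enhancement layer only.
fb = build_feedback(current_frame=10, n=2, status={8: [True, True, False, True]})
```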

Regarding claim 23, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses after obtaining the to-be-encoded image, the one or more processors are further configured to execute the instructions to cause the source device encoder to perform, when the feedback information is not received or the feedback information comprises identification information indicating a receiving failure or a decoding failure, inter encoding on the base layer based on a third reference frame, wherein the third reference frame is for a base layer of a previous image frame of the to-be-encoded image. (Figs. 3-5 and Paragraphs 37-38: “the coding computer system can receive a notice of the lost data and can code and insert a predicted key frame (320) as frame 7. That predicted key frame (320) can include a prediction that references the previous key frame (frame 5), … the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7. Because the predicted key frames (320) use some inter-frame prediction, they may be coded more efficiently than would intra-coded frames, such as IDR-type key frames, and they can still allow for synchronization and cutting off of drift from lost data.”)

Regarding claim 24, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses after obtaining the to-be-encoded image, the one or more processors are further configured to execute the instructions to cause the source device encoder to perform, when the feedback information is not received or the feedback information comprises identification information indicating a receiving failure or a decoding failure, intra encoding on the base layer. (Figs. 3-5 and Paragraphs 42-43: “In response to receiving a notice of the lost data (560) in the enhancement layer (504) of frame 6, the coding computer system can code and insert an anchor predicted frame (520) as frame 7. The anchor predicted frame can include a base layer (502) with a prediction that references the base layer (502) of frame 6. The base layer (502) of the anchor predicted frame (520) can be coded as in the regular predicted frames (520), while the enhancement layer (504) of frame 7 can include a prediction that is limited to only including intra-frame references. … In response to receiving a notification of lost data (560) in the base layer (502) of frame 9, the coding computer system can code and insert an IDR-type key frame (510) as frame 10 to cut off drift from the lost data (560) in frame 9”)
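Claims 23 and 24 recite two alternative fallbacks for the base layer when feedback is missing or reports a failure. The sketch below lays out that decision as a small function. It is a hypothetical illustration: the enum, the function name, and the intra_on_fallback switch (which simply selects between the claim 23 and claim 24 behavior) are assumptions, not language from the claims or from Lu.

```python
from enum import Enum


class BaseLayerMode(Enum):
    INTER_FEEDBACK_REF = "inter, reference named by the feedback"
    INTER_PREV_BASE = "inter, base layer of the previous frame"   # claim 23 fallback
    INTRA = "intra"                                               # claim 24 fallback


def choose_base_layer_mode(feedback_received: bool,
                           reports_failure: bool,
                           intra_on_fallback: bool) -> BaseLayerMode:
    """Pick a base-layer coding mode from the feedback state.

    intra_on_fallback is a hypothetical configuration switch selecting
    which of the two recited fallbacks applies.
    """
    if feedback_received and not reports_failure:
        return BaseLayerMode.INTER_FEEDBACK_REF
    if intra_on_fallback:
        return BaseLayerMode.INTRA
    return BaseLayerMode.INTER_PREV_BASE


# Usage: no feedback arrived and the encoder is configured for the intra fallback.
mode = choose_base_layer_mode(feedback_received=False,
                              reports_failure=False,
                              intra_on_fallback=True)
```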

Regarding claim 25, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses the one or more processors are further configured to execute the instructions to cause the source device encoder to encode the at least one enhancement layer by performing inter encoding on a first enhancement layer based on a second reference frame to obtain a third bitstream of the first enhancement layer, wherein the first enhancement layer is any one of the at least one enhancement layer, wherein the second reference frame is a second reconstructed image corresponding to a first image layer, (Figs. 3-5 and Paragraphs 37-38: “the coding computer system can receive a notice of the lost data and can code and insert a predicted key frame (320) as frame 7. That predicted key frame (320) can include a prediction that references the previous key frame (frame 5), … in frame 9, there is lost data (360) in the base layer (302) of frame 9. Because data in the base layer (302) is lost, and the enhancement layer (304) of frame 9 includes prediction that references the base layer (302), the enhancement layer (304) cannot be correctly decoded … the coding computer system can receive a notification of the lost data (360) in frame 9, and can respond by coding and inserting a predicted key frame (320) as frame 10 in the bitstream, so that frame 10 can act as a synchronization predicted video frame, as with the predicted key frame (320) coded and inserted as frame 7. Because the predicted key frames (320) use some inter-frame prediction, they may be coded more efficiently than would intra-coded frames, such as IDR-type key frames, and they can still allow for synchronization and cutting off of drift from lost data.”) and wherein a quality or a resolution of the first image layer is lower than a quality or a resolution of the first enhancement layer. (Figs. 3-5 and Paragraph 33: “an enhancement layer may add additional quality features (e.g., smaller quantization step size) and/or higher spatial resolution.”)

Regarding claim 26, Lu, as modified by Mukherjee, discloses all of the claimed invention. Lu further discloses the first image layer is lower than the first enhancement layer or is the base layer. (Paragraph 35: “Each enhancement layer (304) of a regular predicted frame (330) can be coded with a prediction that also references the highest enhancement layer of the previous frame, and that references the base layer (302) and/or one or more lower enhancement layers of that same frame.”; Figs. 3-5 show a base layer and an enhancement layer)

Regarding claim 27, Lu, as modified by Mukherjee, discloses all of the claimed invention. Mukherjee further discloses during encoding the at least one enhancement layer, the one or more processors are further configured to execute the instructions to cause the source device encoder to buffer reconstructed images corresponding to the base layer and the at least one enhancement layer. (Paragraph 28: “List A 167 is the list of correctly received and decoded enhancement frames (e.g., reconstructed frame(s) 161) using the enhancement information transmitted. List B is a list of frames (e.g., SSR frame(s) 159) that have been reconstructed using a suitable semi-super-resolution operation using only the base layer and previous List A or List B frames.”; Paragraph 45: “in response to feedback from a receiver, wherein the feedback is based on the base layer of the current frame and previous SRR versions of base layer frames (e.g., List B 157) and correctly received enhancement frames (list A, 167): coding strategy for each block of the current frame is determined; and one or more of source coding and Wyner-Ziv coding is utilized to code.”)
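The buffering recited in claim 27 can be pictured as a bounded store of reconstructed images per frame and per layer. The sketch below is a hypothetical illustration, not the claimed apparatus or Mukherjee's List A / List B structure; the class name, the max_frames limit, and the oldest-first eviction policy are assumptions made for the example.

```python
from collections import OrderedDict
from typing import Dict, Optional


class ReconBuffer:
    """Keeps reconstructed images for the most recent max_frames frames."""

    def __init__(self, max_frames: int) -> None:
        self.max_frames = max_frames
        # frame sequence number -> {layer sequence number -> reconstructed image}
        self._frames: "OrderedDict[int, Dict[int, bytes]]" = OrderedDict()

    def store(self, frame_seq: int, layer_seq: int, recon: bytes) -> None:
        self._frames.setdefault(frame_seq, {})[layer_seq] = recon
        # Evict the oldest frames once the assumed capacity is exceeded.
        while len(self._frames) > self.max_frames:
            self._frames.popitem(last=False)

    def get(self, frame_seq: int, layer_seq: int) -> Optional[bytes]:
        return self._frames.get(frame_seq, {}).get(layer_seq)


# Usage: with room for two frames, storing a third evicts frame 1.
buf = ReconBuffer(max_frames=2)
buf.store(1, 0, b"f1-base")
buf.store(2, 0, b"f2-base")
buf.store(3, 0, b"f3-base")
buf.store(3, 1, b"f3-enh1")
```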

Regarding claim 28, Lu, as modified by Mukherjee, discloses all of the claimed invention. Mukherjee further discloses before determining the reconstructed image, the one or more processors are further configured to execute the instructions to cause the source device encoder to: monitor the feedback information within a specified duration; and determine, when the feedback information is received within the specified duration, that the feedback information is received. (Paragraphs 38-39: “In response to either enhancement information either received or not received with an allotted time, as described above, Lists A and B are updated so as to keep only the latest certain number of frames in each list. … in a video conferencing application only a few hundred milliseconds is acceptable, so if the RTT and encoding/decoding time is less than this, then the A list would be output; otherwise the B list must be used as output.”; Paragraph 45: “in response to feedback from a receiver, wherein the feedback is based on the base layer of the current frame and previous SRR versions of base layer frames (e.g., List B 157) and correctly received enhancement frames (list A, 167): coding strategy for each block of the current frame is determined; and one or more of source coding and Wyner-Ziv coding is utilized to code.”)
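The "monitor within a specified duration" step of claim 28 maps naturally onto a timed wait. The sketch below uses Python's standard queue module for that purpose. It is a hypothetical illustration: the in-process queue standing in for the feedback channel, the dictionary message format, and the function name are all assumptions.

```python
import queue
from typing import Optional


def wait_for_feedback(channel: "queue.Queue[dict]",
                      timeout_s: float) -> Optional[dict]:
    """Return the feedback message if one arrives within timeout_s seconds.

    Returning None corresponds to treating the feedback as not received.
    """
    try:
        return channel.get(timeout=timeout_s)
    except queue.Empty:
        return None


# Usage: nothing is pending on the first wait, then one message arrives.
feedback_channel: "queue.Queue[dict]" = queue.Queue()
first = wait_for_feedback(feedback_channel, timeout_s=0.01)
feedback_channel.put({"frame_seq": 3, "layer_seq": 1})
second = wait_for_feedback(feedback_channel, timeout_s=0.01)
```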

Regarding claim 29, Lu, as modified by Mukherjee, discloses all of the claimed invention. Mukherjee further discloses the first frame sequence number indicates a preceding nth image frame of the to-be-encoded image, wherein n is a positive integer, (Fig. 2 and Paragraph 14: “Frames 200 can include any number of frames. In one embodiment, frames 200 can include frames N-x through N+y. In particular, frames N-3 through N-1 are previous encoded frames transmitted to decoder 104 and frame N is a current frame.”) wherein the sub-image is one of a plurality of sub-images in the entire image frame, wherein each of the sub-images correspond to one of a plurality of layer sequence numbers, (Paragraphs 16-17: “Decimator 115 is configured to receive and downsample source video 105. Decimator 115 downsamples on a frame by frame basis. In one embodiment, decimator 115 downsamples by a factor of 2 … Encoder 120 is configured to encode downscaled frames 200 of source video 105.”) wherein the first reference frame comprises the layer sequence numbers, and wherein each of the layer sequence numbers corresponds to: an image layer that has highest quality or resolution and that is successfully decoded by the decoder side from a bitstream of the preceding nth image frame of the corresponding sub-image; an image layer that has highest quality or resolution and that is successfully received by the decoder side from a bitstream of the preceding nth image frame of the corresponding sub-image; or an image layer that is determined by the decoder side to have highest quality or resolution and that is to be decoded from a bitstream of the preceding nth image frame of the corresponding sub-image. (Paragraphs 27-30: “Semi-Super-Resolution (SSR) predictor 155 is configured to receive reconstructed base layer 153 and generate a SSR frame 159 by using frames in current List A 167 and List B 157. … decoder 104 maintains two lists of previously reconstructed full-resolution frames. List A 167 is the list of correctly received and decoded enhancement frames (e.g., reconstructed frame(s) 161) using the enhancement information transmitted. List B is a list of frames (e.g., SSR frame(s) 159) that have been reconstructed using a suitable semi-super-resolution operation using only the base layer and previous List A or List B frames … Comparator 170 is configured to generate feedback 180 and transmit feedback 180 to encoder 102 … feedback 180 indicates how "good" the match is (e.g., on scale of 0-1) on a region-by-region basis. In particular, in the process of matching required for the semi-super-resolution operation, comparator 170 also computes a confidence measure or a goodness of match metric for each region by searching frames in the lists of references.”)

Regarding claim 30, Lu, as modified by Mukherjee, discloses all of the claimed invention. Mukherjee further discloses the first frame sequence number indicates a preceding nth image frame of the to-be-encoded image, wherein n is a positive integer, (Fig. 2 and Paragraph 14: “Frames 200 can include any number of frames. In one embodiment, frames 200 can include frames N-x through N+y. In particular, frames N-3 through N-1 are previous encoded frames transmitted to decoder 104 and frame N is a current frame.”) wherein the sub-image is one of a plurality of sub-images in the entire image frame, wherein each of the sub-images correspond to one of a plurality of layer sequence numbers, (Paragraphs 16-17: “Decimator 115 is configured to receive and downsample source video 105. Decimator 115 downsamples on a frame by frame basis. In one embodiment, decimator 115 downsamples by a factor of 2 … Encoder 120 is configured to encode downscaled frames 200 of source video 105.”) wherein the first reference frame comprises the layer sequence numbers, and wherein each of the layer sequence numbers corresponds to: an image layer that has highest quality or resolution and that is successfully decoded by the decoder side from a bitstream of the preceding nth image frame of the corresponding sub-image; an image layer that has highest quality or resolution and that is successfully received by the decoder side from a bitstream of the preceding nth image frame of the corresponding sub-image; or an image layer that is determined by the decoder side to have highest quality or resolution and that is to be decoded from a bitstream of the preceding nth image frame of the corresponding sub-image. (Paragraphs 27-30: “Semi-Super-Resolution (SSR) predictor 155 is configured to receive reconstructed base layer 153 and generate a SSR frame 159 by using frames in current List A 167 and List B 157. … decoder 104 maintains two lists of previously reconstructed full-resolution frames. List A 167 is the list of correctly received and decoded enhancement frames (e.g., reconstructed frame(s) 161) using the enhancement information transmitted. List B is a list of frames (e.g., SSR frame(s) 159) that have been reconstructed using a suitable semi-super-resolution operation using only the base layer and previous List A or List B frames … Comparator 170 is configured to generate feedback 180 and transmit feedback 180 to encoder 102 … feedback 180 indicates how "good" the match is (e.g., on scale of 0-1) on a region-by-region basis. In particular, in the process of matching required for the semi-super-resolution operation, comparator 170 also computes a confidence measure or a goodness of match metric for each region by searching frames in the lists of references.”)


Relevant Prior Art Directed to State of Art
Wu et al (U.S. 20020150158 A1), “Drifting Reduction And Macroblock-based Control In Progressive Fine Granularity Scalable Video Coding”, teaches a motion-compensated video encoding scheme that employs progressive fine-granularity layered coding to encode macroblocks of video data into frames having multiple layers, including a base layer of comparatively low quality video and multiple enhancement layers of increasingly higher quality video. Some of the enhancement layers in a current frame are predicted from different quality layers in reference frames. The video encoding scheme estimates drifting errors during the encoding and chooses a coding mode for each macroblock in the enhancement layer to maximize coding efficiency while minimizing drifting errors.
He et al (U.S. 20090060035 A1), “Temporal Scalability for Low Delay Scalable Video Coding”, teaches a method of processing video information which includes receiving encoded video information including an encoded base layer frame and encoded enhanced layer frames for providing temporal scalability, decoding the encoded video information in display order, and using a decoded first enhanced layer frame as a reference frame for decoding a second enhanced layer frame for forward prediction. Processing the video information in display order and using a decoded enhanced layer frame as a reference frame for processing another enhanced layer frame for forward prediction reduces coding latency for achieving temporal scalability for low delay scalable video coding.
Onno et al (U.S. 20140192860 A1), “Method, Devices, Computer Program, and Information Storage Means for Encoding or Decoding a Scalable Video Sequence”, teaches a method of encoding or decoding a scalable video sequence of frames encoded in a bit-stream made of at least one lower layer and one upper layer, comprising: decoding a lower layer bitstream to obtain first sample adaptive offset, SAO, parameters defining a first SAO filtering applied to at least one lower layer frame area; and decoding an upper layer bitstream into at least one decoded upper layer frame area, using a second SAO filtering applied to at least one processed frame area of a processed frame based on respective second SAO parameters; wherein at least one flag in the bit-stream indicates that part or all of the second SAO parameters are inferred from the first SAO parameters.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
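The reply-period rules in the preceding paragraph reduce to simple date arithmetic, sketched below in Python. This is a simplified illustration only and not legal advice: it ignores weekend and holiday adjustments and any extension-fee calculation, the helper names are hypothetical, and the reply and advisory-action dates in the usage lines are invented for the example. The mailing date used is the Dec 10, 2025 final rejection date listed in the prosecution timeline.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(mailed: date,
                            first_reply: Optional[date] = None,
                            advisory_mailed: Optional[date] = None) -> date:
    """Expiry of the shortened statutory period, before any extensions."""
    three_month_end = add_months(mailed, 3)
    six_month_cap = add_months(mailed, 6)
    replied_within_two = (first_reply is not None
                          and first_reply <= add_months(mailed, 2))
    if (replied_within_two and advisory_mailed is not None
            and advisory_mailed > three_month_end):
        # Period runs to the advisory action date, never past six months.
        return min(advisory_mailed, six_month_cap)
    return three_month_end


# Usage with the timeline's mailing date and hypothetical reply/advisory dates.
default_expiry = shortened_period_expiry(date(2025, 12, 10))
shifted_expiry = shortened_period_expiry(date(2025, 12, 10),
                                         first_reply=date(2026, 2, 5),
                                         advisory_mailed=date(2026, 3, 20))
```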
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Duy A Tran whose telephone number is (571)272-4887. The examiner can normally be reached Monday-Friday 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ONEAL R MISTRY can be reached at (313)-446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DUY TRAN/, Examiner, Art Unit 2674

/ONEAL R MISTRY/, Supervisory Patent Examiner, Art Unit 2674
Prosecution Timeline

May 21, 2025: Non-Final Rejection mailed (§103)
Aug 19, 2025: Response Filed
Dec 10, 2025: Final Rejection mailed (§103)
Feb 24, 2026: Response after Non-Final Action
Feb 24, 2026: Applicant Interview (Telephonic)
Feb 25, 2026: Examiner Interview Summary
May 11, 2026: Request for Continued Examination
May 12, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

Application No. | Patent No. | Title | Time to grant | Grant date
18/332,927 | 12632960 | CIGAR TOBACCO LEAF HARVESTING MATURITY IDENTIFICATION METHOD AND SYSTEM BASED ON INTEGRATED LEARNING | 2y 11m | May 19, 2026
18/085,007 | 12614277 | OUTPUT DEVICE, METHOD, NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM AND DISPLAY DEVICE | 3y 4m | Apr 28, 2026
18/035,858 | 12608979 | GESTURE RECOGNITION APPARATUS AND METHOD FOR RECOGNIZING GESTURE | 2y 11m | Apr 21, 2026
18/176,497 | 12608797 | MEDICAL IMAGE DETECTION SYSTEM, TRAINING METHOD AND MEDICAL ANALYZATION METHOD | 3y 1m | Apr 21, 2026
17/947,989 | 12573024 | IMAGE AUGMENTATION FOR MACHINE LEARNING BASED DEFECT EXAMINATION | 3y 5m | Mar 10, 2026
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 80%
With Interview (+18.4%): 99%
Median Time to Grant: 2y 10m (~0m remaining)
PTA Risk: Moderate
Based on 133 resolved cases by this examiner. Grant probability derived from career allowance rate.
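The projection figures can be reproduced approximately from the career counts given in the examiner profile (107 granted out of 133 resolved). The sketch below assumes the interview lift is added as percentage points to the career allowance rate and capped at 100%; the page does not state how it combines the two numbers, so this is a guess at the arithmetic, and the function name is hypothetical.

```python
from typing import Tuple


def grant_projection(granted: int, resolved: int,
                     lift_points: float) -> Tuple[float, float]:
    """Return (base rate, rate with interview) as percentages.

    Assumes the lift is additive in percentage points and capped at 100.
    """
    base = 100.0 * granted / resolved
    with_interview = min(base + lift_points, 100.0)
    return base, with_interview


# Usage with the counts and lift shown on the page.
base_pct, interview_pct = grant_projection(granted=107, resolved=133, lift_points=18.4)
```

Rounded to whole percentages, the two values match the 80% and 99% shown above, which is consistent with, but does not confirm, the additive assumption.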